







CONSISTENCY AND UNBIASEDNESS OF CERTAIN 
NONPARAMETRIC TESTS 


By E. L. LEHMANN 
University of California, Berkeley 


1. Summary. It is shown that there exist strictly unbiased and consistent 
tests for the univariate and multivariate two- and k-sample problem, for the 
hypothesis of independence, and for the hypothesis of symmetry with respect 
to a given point. Certain new tests for the univariate two-sample problem are 
discussed. The large sample power of these tests and of the Mann-Whitney test
is obtained by means of a theorem of Hoeffding. There is a discussion of the
problem of tied observations. 

2. Introduction. The purpose of the present paper is to investigate the exist- 
ence and various properties of strictly unbiased and of consistent tests for testing 
certain nonparametric hypotheses. The problems that will be considered are 
the two-sample and k-sample problem, the hypothesis of independence and the 
hypothesis of symmetry with respect to a given point. 

A sequence of tests is said to be consistent against a certain class of alternatives 
if for each alternative the power of the test tends to one as the sample sizes tend 
to infinity. A test will be said to be strictly unbiased if the power for each alterna- 
tive exceeds the level of significance. 

Consistency being a rather weak property, which one would expect most se- 
quences of tests to satisfy for the class of alternatives for which they are designed, 
it is important to obtain some more detailed information concerning the power 
of the various tests under consideration. Because of the tremendous variety of 
the alternatives it seems fairly hopeless to get a comprehensive view of the 
achievements of most tests when the samples are small. This in spite of the fact 
that it is occasionally possible to write down the power explicitly (for example 
in the simplest cases of the tests discussed by Mathisen [1]). On the other hand,
the large sample distribution of a number of test statistics may be found by 
means of the asymptotic theorems of Hoeffding [2]. Asymptotically, the power 
then usually involves only a few parameters and a large sample comparison of 
various different tests becomes possible. 

3. Two-sample problem: specific classes of alternatives. We shall discuss 
in detail only one of the problems mentioned, the two-sample problem, and 
indicate only briefly certain extensions to the other problems. In the two-sample 
problem one is given independent samples $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$ from
populations with unknown cumulative distribution functions F and G respec- 
tively, and it is desired to test the hypothesis F = G. In this connection various 
classes of alternatives are possible. 

It may, for example, be known that unless F = G, the Y’s tend to be larger 
than the X’s. For this problem it has been proposed as a test to consider the
number of pairs $X_i, Y_j$ for which $X_i < Y_j$, and to reject the hypothesis if this
number is too large. This test was proved consistent by Mann and Whitney
[3] against the alternatives that


(3.1) $F(t) \ge G(t)$ for all $t$.


Actually their proof shows that the test is consistent¹ against all alternatives
for which $P(Y_j > X_i) > \tfrac{1}{2}$.

We shall now prove that this test is also unbiased against the alternatives
satisfying (3.1).² This is true not only for this test but also for those proposed
by Thompson [4] and for tests based on randomisation of such statistics as
$\bar{Y} - \bar{X}$. In fact we have

THEOREM 3.1. Let $w$ be any similar region for testing $H: F = G$ on the basis of
$X_1, \dots, X_m; Y_1, \dots, Y_n$. Suppose $w$ is such that
$(x_1, \dots, x_m; z_1, \dots, z_n) \in w$ and $y_i \ge z_i$ for $i = 1, \dots, n$ implies
$(x_1, \dots, x_m; y_1, \dots, y_n) \in w$. Then the
test is unbiased against all continuous alternatives $F, G$ satisfying (3.1).

PROOF. Suppose that $X_1, \dots, X_m, Z_1, \dots, Z_n$ are independent and all
have the same c.d.f. $F$ and that $G$ is such that (3.1) holds. Then we shall
construct $Y_i = f(Z_i)$ such that $Y_i \ge Z_i$ for $i = 1, \dots, n$ and such that the $Y$’s
have c.d.f. $G$. Thus the probability of $(X_1, \dots, X_m; Z_1, \dots, Z_n) \in w$ equals
the level of significance, $\alpha$ say, while the probability of
$(X_1, \dots, X_m; Y_1, \dots, Y_n) \in w$ equals the power of the test against the
alternative $(F, G)$. But since $Y_i \ge Z_i$ for all $i$, the test rejects for the $X$’s and
$Y$’s whenever it rejects for the $X$’s and $Z$’s, and hence the power is $\ge \alpha$.

The function $f$ is easily defined by the equation

$$G(f(z)) = F(z).$$

(When this does not define $f(z)$ uniquely, it does not matter which of the possible
definitions is used.) That $y = f(z) \ge z$ follows from assumption (3.1).
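The construction of $f$ can be illustrated numerically. The following is a minimal sketch under assumed distributions that do not come from the paper: $F$ is taken to be exponential with rate 2 and $G$ exponential with rate 1, so that $F(t) \ge G(t)$ for all $t$. The helper names `G_inverse` and `f` are hypothetical.

```python
import math


def F(t):
    """Assumed c.d.f. of the X's and Z's: exponential with rate 2."""
    return 1.0 - math.exp(-2.0 * t) if t > 0 else 0.0


def G(t):
    """Assumed c.d.f. of the Y's: exponential with rate 1, so F(t) >= G(t)."""
    return 1.0 - math.exp(-t) if t > 0 else 0.0


def G_inverse(u):
    """Quantile function of the assumed G for 0 <= u < 1."""
    return -math.log(1.0 - u)


def f(z):
    """Solve G(f(z)) = F(z) for f(z); Y = f(Z) then has c.d.f. G."""
    return G_inverse(F(z))


for z in (0.1, 0.5, 1.0, 3.0):
    # For these two distributions f(z) works out to 2z, which is >= z.
    print(z, f(z), f(z) >= z)
```

For this particular pair the map is $f(z) = 2z$, so each $Y_i = f(Z_i)$ is at least as large as $Z_i$, which is the monotonicity the proof relies on.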


The theorem as stated refers only to tests in which no randomisation is al- 
lowed, but the extension to randomised tests is immediate. Also, as we shall show 
later, the assumption of continuity of F and G may be omitted. 

Theorem 3.1 may be used also to widen the applicability of the tests to which 
it refers. So far, we have taken the hypothesis to state that X and Y have the 
same distribution. This formulation may arise, for example, when one is faced 
with the question whether a treatment, known to be harmless, has a beneficial 
effect: Either it has no effect so that F = G, or it has a good effect. If, on the 
other hand, the comparison is between two different treatments one may wish 
to test hypothesis H’ that Y tends to be smaller than X, against the alternatives 
that it tends to be larger. The hypothesis would then be 


$H'$: $F(t) \le G(t)$ for all $t$.

¹ This was also noticed by van Dantzig who points it out in a paper “On the consistency
and the power function of Wilcoxon’s two sample test,” to be published in Proc. Roy. Inst.
Acad. Sci., 1951.

² For alternatives $(F, G)$ differing only in location this was proved by Van der Vaart [26].







There is of course no nontrivial similar region for this problem; however, any
region $w$ satisfying the condition of Theorem 3.1 and such that $P(w) = \alpha$
whenever $F = G$ clearly will be of size $\alpha$ for testing $H'$, i.e. $P(w)$ will be
$\le \alpha$ whenever $F(t) \le G(t)$ for all $t$.

Returning to the Mann-Whitney test, let us define V by 


(3.2) $mnV$ = number of pairs $X_i, Y_j$ with $X_i < Y_j$.


It was shown by Mann and Whitney that $V$ is asymptotically normally
distributed when $F = G$ and $m, n \to \infty$ in an arbitrary manner. From a result of
Hoeffding (Theorem 7.3 of [2]) it follows that asymptotic normality holds also
when $F \ne G$ provided $m/n$ remains constant as $m, n \to \infty$.
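As an illustration of definition (3.2), the statistic $V$ can be computed by direct counting. This is a small sketch; the function name `mann_whitney_v` and the sample values are made up for the example.

```python
def mann_whitney_v(xs, ys):
    """Return V, the proportion of pairs (X_i, Y_j) with X_i < Y_j."""
    count = sum(1 for x in xs for y in ys if x < y)
    return count / (len(xs) * len(ys))


xs = [1.2, 3.4, 2.2]        # hypothetical X sample (m = 3)
ys = [2.0, 4.1, 3.9, 5.0]   # hypothetical Y sample (n = 4)
print(mann_whitney_v(xs, ys))
```

The double loop costs $mn$ comparisons, which is adequate for illustration; a rank-based computation would be preferable for large samples.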

We shall apply Hoeffding’s theorem to prove asymptotic normality not only 
of V, but of a large class of statistics connected with the two-sample problem. 
We begin by stating Hoeffding’s theorem, somewhat specialised and with slight 
changes of notation: 

Let $Z_1, \dots, Z_n$ be independently, identically distributed chance vectors with
real components, let $s < n$ and let $\phi(Z_1, \dots, Z_s)$ be a real valued symmetric
function of its $s$ arguments such that $E[\phi(Z_1, \dots, Z_s)]^2 < \infty$, and let us write

$$E\phi(Z_1, \dots, Z_s) = \theta.$$

Let

$$U_n = \binom{n}{s}^{-1} \sum \phi(Z_{\alpha_1}, \dots, Z_{\alpha_s}),$$

where the summation extends over all subscripts $1 \le \alpha_1 < \dots < \alpha_s \le n$, and let

$$U_n' = U_n + R_n,$$

where $R_n$ is a random variable for which

$$E(nR_n^2) \to 0 \quad \text{as } n \to \infty.$$

Then $\sqrt{n}(U_n' - \theta)$ is asymptotically normally distributed. Further, if we put

$$\psi(z_1) = E\phi(z_1, Z_2, \dots, Z_s) - \theta,$$

the limiting distribution of $\sqrt{n}(U_n' - \theta)$ is nondegenerate provided $E[\psi(Z_1)]^2 > 0$.
We can now state
THEOREM 3.2. Let $X_1, \dots, X_m; Y_1, \dots, Y_n$ be independently distributed
with c.d.f.’s $F$, $G$ respectively. Let $t(x_1, \dots, x_r, y_1, \dots, y_r)$ be symmetric in the
$x$’s alone and in the $y$’s alone. Suppose that

$$E\,t(X_1, \dots, X_r, Y_1, \dots, Y_r) = \theta(F, G) = \theta,$$
$$E[t(X_1, \dots, X_r, Y_1, \dots, Y_r)]^2 = M < \infty.$$

Let $m/n = c$, and let $n$ be sufficiently large so that $r < m, n$. Define

$$U_n' = \binom{m}{r}^{-1}\binom{n}{r}^{-1} \sum t(X_{\alpha_1}, \dots, X_{\alpha_r}; Y_{\beta_1}, \dots, Y_{\beta_r}),$$

where the summation is extended over all subscripts $1 \le \alpha_1 < \dots < \alpha_r \le m$;
$1 \le \beta_1 < \dots < \beta_r \le n$. Then, as $n \to \infty$, $\sqrt{n}(U_n' - \theta)$ is asymptotically normally
distributed.

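PROOF. For the sake of simplicity we shall give the proof only in the case
$m = n$. Let $Z_i = (X_i, Y_i)$ and define

$$\phi(Z_1, \dots, Z_{2r}) = \binom{2r}{r}^{-1} \sum t(X_{i_1}, \dots, X_{i_r}; Y_{j_1}, \dots, Y_{j_r})$$

summed over all sets of indices for which $(i_1 < \dots < i_r;\ j_1 < \dots < j_r)$ is a
permutation of $(1, \dots, 2r)$. Further let

$$U_n = \binom{n}{2r}^{-1} \sum \phi(Z_{\gamma_1}, \dots, Z_{\gamma_{2r}})$$

summed over all $\gamma$’s such that $1 \le \gamma_1 < \dots < \gamma_{2r} \le n$.

Clearly $\binom{n}{r}^2 U_n'$ is the sum of all possible $t$-terms, while
$\binom{n}{2r}\binom{2r}{r} U_n$ is the sum of only those $t$-terms in which the $X$’s and
$Y$’s have no common subscript. Hence, since
$\binom{n}{2r}\binom{2r}{r} = \binom{n}{r}\binom{n-r}{r}$, we have

$$U_n' = \binom{n-r}{r}\binom{n}{r}^{-1} U_n + \binom{n}{r}^{-2} W_n,$$

where $W_n$ is a sum of $\binom{n}{r}^2 - \binom{n}{r}\binom{n-r}{r}$ $t$-terms, and we can write

$$U_n' = U_n + D_n,$$

where

$$D_n = \binom{n}{r}^{-2} W_n - \left[\binom{n}{r} - \binom{n-r}{r}\right]\binom{n}{r}^{-1} U_n.$$

Since for any real numbers $t_1, \dots, t_k$ we have
$(t_1 + \dots + t_k)^2 \le k(t_1^2 + \dots + t_k^2)$ we see that

$$E(D_n^2) \le 2\binom{n}{r}^{-4} E(W_n^2) + 2\left[\binom{n}{r} - \binom{n-r}{r}\right]^2\binom{n}{r}^{-2} E(U_n^2)
\le 4\left[\binom{n}{r} - \binom{n-r}{r}\right]^2\binom{n}{r}^{-2} M.$$

But, as $n \to \infty$, $\sqrt{n}\left[\binom{n}{r} - \binom{n-r}{r}\right]\binom{n}{r}^{-1} \to 0$.
Hence $E(nD_n^2) \to 0$ and the result follows.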
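Two combinatorial facts carry the proof: the counting identity for $t$-terms without common subscripts, and the fact that the proportion of $t$-terms with a common subscript vanishes faster than $1/\sqrt{n}$. Both can be checked numerically. The sketch below uses hypothetical helper names and is only a sanity check, not part of the argument.

```python
from math import comb, sqrt


def disjoint_term_identity(n, r):
    """Check C(n, 2r) * C(2r, r) == C(n, r) * C(n - r, r)."""
    return comb(n, 2 * r) * comb(2 * r, r) == comb(n, r) * comb(n - r, r)


def scaled_gap(n, r):
    """sqrt(n) times the proportion of t-terms sharing a subscript."""
    return sqrt(n) * (comb(n, r) - comb(n - r, r)) / comb(n, r)


print(all(disjoint_term_identity(n, r)
          for n in range(2, 30) for r in range(1, n // 2 + 1)))
for n in (10**2, 10**4, 10**6):
    # For fixed r the scaled gap shrinks roughly like r^2 / sqrt(n).
    print(n, scaled_gap(n, 2))
```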
Let us now consider the application of this theorem to the Mann-Whitney
statistic. We define

$$t(x, y) = 1 \text{ if } x < y, \qquad t(x, y) = 0 \text{ otherwise.}$$

Then $U_n' = V_{n,n}$, and asymptotic normality follows since $E\,t^2(X, Y) \le 1$. It
remains to check under what conditions $E\psi^2(Z_1) > 0$. Since we have $s = 2r = 2$,

$$2\psi(z_1) = P(Y_2 > x_1) + P(y_1 > X_2) - 2P(Y > X).$$

Hence $E\psi^2(Z) = 0$ is equivalent to $F(Y) - G(X) =$ constant with probability 1,
or $P(Y > x) + P(y > X) =$ constant except on a set of points $(x, y)$ that has
probability zero. It is easy to see that this is satisfied if and only if $P(Y > X)$ is
1 or 0.

So far we have considered the hypothesis $H: F = G$ against the alternatives
that the $Y$’s tend to be larger than the $X$’s. As a second example we shall consider
testing $H$, or even the wider hypothesis $H'$ that $F$ and $G$ differ only in location
(i.e., that $F(x) = G(x + d)$ for some $d$), against the alternative that the $Y$’s
are more spread out than the $X$’s (in a sense to be defined below). In analogy
with the Mann-Whitney test let $W_{m,n}$ be the proportion of quadruples $X_i$,
$X_j$, $Y_k$, $Y_l$ for which $|Y_l - Y_k| > |X_j - X_i|$. We reject $H$ if $W_{m,n}$ is too
large. This test is unbiased against all alternatives $(F, G)$ for which $F(x_1) =
G(y_1)$, $F(x_2) = G(y_2)$ implies $|x_1 - x_2| < |y_1 - y_2|$. The test is consistent against
the wider class of alternatives for which $P(|Y' - Y| > |X' - X|) > \tfrac{1}{2}$
where $X, X'$; $Y, Y'$ are independently distributed with c.d.f. $F$, $G$, respectively.
The proof of unbiasedness is quite analogous to the one given previously, and
we shall therefore omit it.
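The dispersion statistic can be computed by comparing every gap between two $X$’s with every gap between two $Y$’s. The sketch below is illustrative only; the name `dispersion_w` and the data are assumptions made for the example.

```python
from itertools import combinations


def dispersion_w(xs, ys):
    """Proportion of quadruples (X_i, X_j, Y_k, Y_l), i < j and k < l,
    for which |Y_l - Y_k| > |X_j - X_i|."""
    x_gaps = [abs(a - b) for a, b in combinations(xs, 2)]
    y_gaps = [abs(a - b) for a, b in combinations(ys, 2)]
    hits = sum(1 for gx in x_gaps for gy in y_gaps if gy > gx)
    return hits / (len(x_gaps) * len(y_gaps))


print(dispersion_w([0, 2], [0, 1, 5]))  # one X gap (2) against Y gaps 1, 5, 4
```

Values near 1 indicate that the $Y$’s are more spread out than the $X$’s, and the test rejects when the proportion is too large.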

We shall however indicate the proof of consistency, and refer in this connection 
to the closely related remarks by Hoeffding [5] on the construction of consistent 
sequences of tests. 

We first state for reference the following trivial 

LEMMA 3.1. Let $\theta = f(F, G)$ be a real valued function such that $f(F, F) = \theta_0$
for all $(F, F)$ in a class $\mathcal{C}_0$. Let $T_{m,n} = t_{m,n}(X_1, \dots, X_m, Y_1, \dots, Y_n)$ be a
sequence of real valued statistics such that $T_{m,n}$ tends to $\theta$ in probability as
$\min(m, n) \to \infty$. Suppose that $f(F, G) > \theta_0$ $(\ne \theta_0)$ for all $(F, G)$ in a class
$\mathcal{C}_1$. Then the sequence of tests which reject when $T_{m,n} - \theta_0 > C_{m,n}$ (when
$|T_{m,n} - \theta_0| > C'_{m,n}$) is consistent for testing $H: \mathcal{C}_0$ at every fixed level of
significance against the alternatives $\mathcal{C}_1$.

For proof one need only notice that a fixed level of significance $\ne 0$ implies
that $C_{m,n} \to 0$ $(C'_{m,n} \to 0)$ as $m, n \to \infty$.

In the applications we have in mind, $E(T_{m,n})$ is usually independent of $m$
and $n$, and is easy to find. On the other hand some work is required to determine
$\sigma^2(T_{m,n})$. It is therefore of interest to notice that the evaluation of $\sigma^2(T_{m,n})$ is
frequently not necessary to prove consistency. To this end we shall state the
following lemma, which is a generalisation of a theorem of Halmos [6], and
which follows easily from Theorem 5.1 of [7]. A simple proof will be given in [8].

LEMMA 3.2 (Lehmann-Scheffé). Let $f(F, G)$ be a real valued function defined
for all continuous c.d.f.’s $F$ and $G$. There exists at most one function $t_{m,n}$ such that
$t_{m,n}(X_1, \dots, X_m, Y_1, \dots, Y_n)$ is symmetric in the first $m$ and in the last $n$
arguments and is an unbiased estimate of $f(F, G)$ for all continuous (or even
absolutely continuous) c.d.f.’s $F$, $G$. If such a function $t_{m,n}$ exists (and has finite
variance), it has among all unbiased estimates of $f(F, G)$ uniformly smallest variance.

For the application to be made here we need the slightly stronger statement
that the conclusion of the Lemma remains valid if
$t_{m,n}(X_1, \dots, X_m; Y_1, \dots, Y_n)$ is an unbiased estimate of $f(F, G)$ for all
continuous c.d.f.’s $F$, $G$ for which

$$P(|Y' - Y| > |X' - X|) > \tfrac{1}{2}.$$

This generalization follows immediately from the proof of the Lemma given
in [8].

The proof of consistency of the proposed test is now immediate. For let $W'_{m,n}$
be the proportion of quadruples for which the $Y$’s are further apart than the
$X$’s among the independent quadruples $X_1, X_2, Y_1, Y_2$; $X_3, X_4, Y_3, Y_4$; $\dots$.
Then

$$E(W_{m,n}) = E(W'_{m,n}) = P(|Y' - Y| > |X' - X|),$$

and hence by Lemma 3.2,

(3.3) $\sigma^2(W_{m,n}) \le \sigma^2(W'_{m,n})$.

But $\sigma^2(W'_{m,n})$ obviously tends to zero as $m, n \to \infty$.

We remark finally that the large sample distribution of $W_{m,n}$, by Theorem
3.2, is again normal. Degeneracy occurs only if either $F$ or $G$ is a one-point
distribution.

As a last problem in this section we shall consider the hypothesis F = G 
against the combined class of alternatives that the Y’s are larger than the X’s 
or more spread out. In such a problem it seems important not only to decide 
whether F and G are equal but, in case the hypothesis is rejected, for which of 
the two possible reasons it is rejected or whether it is rejected for both of them. 
(See in this connection the discussion by Berkson [9]). Thus one is really dealing 
with a multidecision problem. One must decide between 

$d_0$: Accepting the hypothesis $H: F = G$,

$d_1$: Rejecting $H$ for the reason that the $Y$’s are larger than the $X$’s,

$d_2$: Rejecting $H$ for the reason that the $Y$’s are more spread out than the $X$’s,

$d_3$: Rejecting $H$ for both reasons.

It is desired to find a decision procedure under which the probability of taking
decision $d_0$ is $1 - \alpha$ when $F = G$ while the probability of taking the appropriate
of the decisions $d_1$, $d_2$, $d_3$ when the hypothesis is false tends to 1 as the sample
sizes tend to infinity. Let us recall the statistics $V_{m,n}$ and $W_{m,n}$ introduced in
connection with the previous problems and let us denote $E(V_{m,n})$ and $E(W_{m,n})$
by $\theta$ and $\eta$ respectively. One may then accept $H$ when $V_{m,n} \le a_{m,n}$, $W_{m,n} \le
b_{m,n}$, or take one of the remaining three decisions according as to which one of
the three complementary inequalities holds. The constants $a_{m,n}$ and $b_{m,n}$ are not
completely determined by the equation

$$P(V_{m,n} \le a_{m,n},\ W_{m,n} \le b_{m,n} \mid F = G) = 1 - \alpha.$$

One may specify some additional restriction, such as

$$P(V_{m,n} \le a_{m,n} \mid F = G) = P(W_{m,n} \le b_{m,n} \mid F = G).$$


It is easy to prove that the above procedure has the consistency property 
asked for. This follows from Lemmas 3.1 and 3.2 generalised to the case that the 
function f(F, G) of these lemmas is vector valued instead of real valued. The 
function t,,,, of Lemma 3.2 is then also vector valued and instead of its variance 
one may consider its ellipsoid of concentration (see [10] and Theorem 5.2 of 
[7]). In the present case we notice that $(V_{m,n}, W_{m,n})$ is a symmetric estimate of
$(\theta, \eta)$ and hence has a uniformly smallest ellipsoid of concentration. But one
can easily construct unbiased estimates of $\theta$ and $\eta$ based on independent samples,
whose concentration ellipsoid has both axes tending to zero as the sample sizes 
increase indefinitely and so consistency can be proved by the device used after 
Lemma 3.2. 

4. Two-sample problem: general class of alternatives. In the present section 
we shall consider the problem of testing the hypothesis F = G against the class 
of all continuous alternatives $F \ne G$. One might argue that this should not be
treated as a hypothesis-testing problem. For Berkson’s argument seems to 
apply: The question is not only whether or not the hypothesis is true. If it is 
false, it is necessary to decide what alternative hypothesis is correct. While in 
some situations, this criticism seems to be valid, there are others in which it does 
not seem to apply. 

The two-sample problem may arise in the following two quite different settings. 

A: Two production processes, treatments or populations are available, and 
it is desired to decide whether one is better than the other. In this case the 
populations F and G are in competition, and the main problem is that of rank- 
ing them. Here the notion of such a ranking automatically suggests some 
specific class or classes of alternatives to the hypothesis that the populations 
do not differ. 

B: The two populations coexist. There is no question of which is preferable, 
but we wish to know whether the two can be treated as one. One may, for 
example, want to know whether the output of two different machines can 
be treated as a uniform product, or whether data obtained under two different 
experimental setups or by two different investigators may be pooled. These 
problems really are two-decision problems: The data can or can not be pooled. 
An explanation of why they can not be pooled is not necessarily of interest. 

In connection with the present problem Wald and Wolfowitz [11] proposed as
test statistic the total number of runs of the ordered $x$’s and $y$’s, the hypothesis
to be rejected if the number of runs is too small. The authors proved their test
consistent, under the assumption of constant ratio of sample sizes $m/n$, against
alternatives of all shapes restricted only by mild assumptions concerning existence
and positiveness of the probability densities. It was also proved in their
paper that the test statistic has an asymptotically normal distribution when the
hypothesis is true. More recently Wolfowitz [12] proved that the limiting
distribution is normal even when $F \ne G$, and obtained the asymptotic variance
for this case. It follows from his results that the test is in general not consistent
if $m/n \to 0$ or $\infty$. This is actually what one would expect since when $m/n$ is
sufficiently extreme the maximum number of runs will in general occur with
near-certainty whether the hypothesis is true or false.

Another test suitable for this problem is that of Smirnov [13] based on the 
maximum difference between the two sample cumulative distribution functions. 
For the given samples $X_1, \dots, X_m$; $Y_1, \dots, Y_n$ let

$\phi_m(t) = \phi(X_1, \dots, X_m; t) = \frac{1}{m}$ (number of $X$’s $\le t$),

$\psi_n(t) = \psi(Y_1, \dots, Y_n; t) = \frac{1}{n}$ (number of $Y$’s $\le t$),

be the two sample cumulative distribution functions. It follows from a theorem
of Glivenko-Cantelli [14] that $\sup_t |\phi_m(t) - F(t)|$ and $\sup_t |\psi_n(t) - G(t)|$
tend to zero in probability as $\min(m, n) \to \infty$. From this it is easily seen that
$\sup_t |\phi_m(t) - \psi_n(t)|$ is a consistent estimate of $\sup_t |F(t) - G(t)|$, and hence
that Smirnov’s test is consistent against all alternatives $F \ne G$ as
$\min(m, n) \to \infty$. A different proof of this fact was given recently by Massey [25].

The large sample distribution of $\sup_t |\phi_m(t) - \psi_n(t)|$ was obtained by Smirnov,
for the case that $F = G$, a simpler proof having recently been given by Feller
[15] (see also Doob [16] and Smirnov [17]). Although the large sample
distribution is not known when $F \ne G$, Massey [25] obtained a lower bound for the
power of Smirnov’s test, which may permit comparing this test with others.
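Since both sample cumulative distribution functions are right-continuous step functions that change only at observed values, the supremum of their absolute difference is attained at one of the observations. A minimal sketch of the computation follows; the function names `ecdf` and `smirnov_statistic` are hypothetical.

```python
def ecdf(sample):
    """Return the sample c.d.f. t -> (number of observations <= t) / size."""
    data = list(sample)
    size = len(data)

    def cdf(t):
        return sum(1 for value in data if value <= t) / size

    return cdf


def smirnov_statistic(xs, ys):
    """Maximum absolute difference between the two sample c.d.f.'s,
    evaluated at every observed value."""
    phi, psi = ecdf(xs), ecdf(ys)
    return max(abs(phi(t) - psi(t)) for t in list(xs) + list(ys))


print(smirnov_statistic([1, 3], [2, 4]))  # 0.5
```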

While these two generally consistent tests are known for the two-sample 
problem, very little work has been done on the existence of unbiased tests for this 
or other nonparametric problems. Mann [18] proved unbiasedness of a test for 
randomness against a certain class of trends. Hoeffding [5] proved the non- 
existence for the hypothesis of independence of unbiased critical regions based 
on ranks, corresponding to certain very small levels of significance. 

As far as the two-sample problem is concerned, Smirnov’s test is easily shown 
to be biased on the basis of an example given by Massey for the problem of 
goodness of fit. On the other hand, it seems very possible that the Wald-Wolfo- 
witz run test is unbiased whenever the two samples are of equal size. We have 
not proved this but shall now construct a test for the two sample problem that 
is strictly unbiased. 

LEMMA 4.1. Let $X, X'; Y, Y'$ be independently drawn from populations with
continuous cumulatives $F$, $G$ respectively, and let us denote for any random variables
$U, U'; V, V'$ the event $\max(U, U') < \min(V, V')$ by $U, U' < V, V'$. Then

$$p = P(X, X' < Y, Y') + P(Y, Y' < X, X') = \tfrac{1}{3} + 2\int (F - G)^2\, d\left(\frac{F + G}{2}\right),$$

and hence $p$ attains its minimum value $\tfrac{1}{3}$ if and only if $F = G$.





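PROOF. Since $F$ and $G$ are continuous,

$$p = \int (1 - F)^2\, dG^2 + \int (1 - G)^2\, dF^2
= 2 + \int (F^2\, dG^2 + G^2\, dF^2) - 4\int FG\, d(F + G)$$
$$= 2 + \int d(F^2 G^2) - 4\int FG\, d(F + G)
= 3 - 2\int \left[(F + G)^2 - (F - G)^2\right] d\left(\frac{F + G}{2}\right)$$
$$= \tfrac{1}{3} + 2\int (F - G)^2\, d\left(\frac{F + G}{2}\right).$$

To prove the second part of the lemma, we must show that
$\Delta = \int (F - G)^2\, d\left(\frac{F + G}{2}\right) = 0$ implies $F = G$. Now $\Delta = 0$ implies
$F(x) = G(x)$ except possibly on a set $N$ such that $\int_N dF = \int_N dG = 0$.
Suppose that $F(x_1) \ne G(x_1)$, $G(x_1) - F(x_1) = \eta > 0$ say. Then by continuity
there exists $x_0 < x_1$ such that $G(x_0) = F(x_1) + \eta/2$ and $F(x) < G(x)$ for
$x_0 \le x \le x_1$. Since $G(x_1) - G(x_0) > 0$, it follows that $\Delta > 0$.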
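The identity of Lemma 4.1 can be checked numerically for a specific pair of distributions. The sketch below assumes, purely for illustration, that $F$ is uniform on $(0, 1)$ and $G$ is uniform on $(0, 2)$, and approximates the Stieltjes integrals by midpoint sums. The helper `stieltjes` is hypothetical. For this pair a direct calculation gives $p = 1/2$ and $\Delta = 1/12$.

```python
def F(t):
    """Assumed c.d.f.: uniform on (0, 1)."""
    return min(max(t, 0.0), 1.0)


def G(t):
    """Assumed c.d.f.: uniform on (0, 2)."""
    return min(max(t / 2.0, 0.0), 1.0)


def stieltjes(h, H, a, b, steps=20000):
    """Midpoint approximation of the integral of h dH over [a, b]."""
    width = (b - a) / steps
    total = 0.0
    for k in range(steps):
        lo = a + k * width
        hi = lo + width
        total += h(0.5 * (lo + hi)) * (H(hi) - H(lo))
    return total


def p_direct():
    """p = int (1 - F)^2 dG^2 + int (1 - G)^2 dF^2."""
    first = stieltjes(lambda t: (1 - F(t)) ** 2, lambda t: G(t) ** 2, 0.0, 2.0)
    second = stieltjes(lambda t: (1 - G(t)) ** 2, lambda t: F(t) ** 2, 0.0, 2.0)
    return first + second


def delta():
    """Delta = int (F - G)^2 d((F + G) / 2)."""
    return stieltjes(lambda t: (F(t) - G(t)) ** 2,
                     lambda t: 0.5 * (F(t) + G(t)), 0.0, 2.0)


print(p_direct(), 1 / 3 + 2 * delta())
```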

It is now clear that there exists a strictly unbiased test of $H: F = G$ if $m, n \ge
2$. For we can consider the number of quadruples $X_{2i-1}, X_{2i}$; $Y_{2i-1}, Y_{2i}$ for which
either the two $X$’s fall below the two $Y$’s or vice versa. These may be regarded
as the successes in independent trials with probability $p = \tfrac{1}{3} + 2\Delta$ of success,
and the problem reduces to that of testing $H: p = \tfrac{1}{3}$ against alternatives $p > \tfrac{1}{3}$.
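The reduction to a binomial problem can be sketched as follows. The code counts successes among the independent quadruples and evaluates the upper binomial tail under $p = 1/3$. The function names and data are assumptions for the example, and the tail probability is shown only as an illustrative p-value; a test of exact level $\alpha$ would in general need randomisation on the boundary.

```python
from math import comb


def count_separated_quadruples(xs, ys):
    """Count quadruples (X_{2i-1}, X_{2i}; Y_{2i-1}, Y_{2i}) in which both
    X's fall below both Y's or vice versa. Returns (successes, trials)."""
    trials = min(len(xs), len(ys)) // 2
    successes = 0
    for i in range(trials):
        x_lo, x_hi = sorted(xs[2 * i:2 * i + 2])
        y_lo, y_hi = sorted(ys[2 * i:2 * i + 2])
        if x_hi < y_lo or y_hi < x_lo:
            successes += 1
    return successes, trials


def upper_tail_p_value(successes, trials, p0=1 / 3):
    """P(S >= successes) for S distributed Binomial(trials, p0)."""
    return sum(comb(trials, k) * p0 ** k * (1 - p0) ** (trials - k)
               for k in range(successes, trials + 1))


s, n_trials = count_separated_quadruples([1, 2, 5, 9, 3, 4], [3, 4, 6, 7, 1, 2])
print(s, n_trials, upper_tail_p_value(s, n_trials))
```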

The unbiased test just described has the pleasant property that its power is a
strictly increasing function of $\Delta = \int (F - G)^2\, d\left(\frac{F + G}{2}\right)$, which seems a
reasonable measure of the degree of difference of $F$ and $G$. On the other hand one
would not expect this test to be very efficient. More reasonable use of the data
seems to be made if one modifies the test in the direction of the Mann-Whitney
test described earlier. One would then compare each pair of $X$’s with each pair
of $Y$’s, and reject $H$ if among the $\binom{m}{2}\binom{n}{2}$ possible quadruples $X_i, X_j$; $Y_k, Y_l$
it happened too frequently that both $X$’s lie on the same side of both $Y$’s.

This test is no longer unbiased, but it is still consistent as follows from the 
argument given in the previous section. Further, the test retains the property 
that the statistic on which it is based provides an unbiased estimate, in fact
the minimum variance unbiased estimate, of the quantity $\int (F - G)^2\, d\left(\frac{F + G}{2}\right)$.
Finally, it is again easily seen that the distribution of the test statistic is
approximately normal, degeneracy occurring only if $P(Y > X)$ equals 1 or 0.
The test can be expressed in a form more convenient for computation in terms
of the ranks of one of the sets of variables. Let $r_1 < r_2 < \dots < r_n$ be the ranks
of the $n$ $Y$’s among the totality of $m + n$ observations, and denote by $Q_{m,n}$ the
number of quadruples $X_i, X_j, Y_k, Y_l$ for which both $X$’s lie on the same side
of both $Y$’s. Then it is easily seen that

$$Q_{m,n} = \sum_{k=1}^{n} \left[(n - k)\binom{r_k - k}{2} + (k - 1)\binom{m - r_k + k}{2}\right].$$

From this it follows by easy computation that

$$2Q_{m,n} = (n - 1)\sum_{k=1}^{n} r_k^2 - 2(n + m - 2)\sum_{k=1}^{n} k r_k - (n - 2m + 1)\sum_{k=1}^{n} r_k$$
$$\qquad + (n + 2m - 3)\frac{n(n + 1)(2n + 1)}{6} + (n + m^2 - 3m + 1)\frac{n(n + 1)}{2} - mn(m - 1).$$
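The rank expressions for $Q_{m,n}$ can be compared with a direct count over all quadruples. The sketch below assumes there are no ties; the function names are hypothetical, and the closed form coded in `twice_q_closed_form` is the one obtained by expanding the sum over $k$ term by term.

```python
from itertools import combinations
from math import comb


def q_brute_force(xs, ys):
    """Count quadruples in which both X's lie on the same side of both Y's."""
    count = 0
    for xi, xj in combinations(xs, 2):
        for yk, yl in combinations(ys, 2):
            if max(xi, xj) < min(yk, yl) or max(yk, yl) < min(xi, xj):
                count += 1
    return count


def y_ranks(xs, ys):
    """Sorted ranks of the Y's in the pooled sample (assumes no ties)."""
    pooled = sorted(list(xs) + list(ys))
    return sorted(pooled.index(y) + 1 for y in ys)


def q_from_ranks(m, n, ranks):
    """Sum over k of (n-k) C(r_k - k, 2) + (k-1) C(m - r_k + k, 2)."""
    return sum((n - k) * comb(r - k, 2) + (k - 1) * comb(m - r + k, 2)
               for k, r in enumerate(ranks, start=1))


def twice_q_closed_form(m, n, ranks):
    """Expanded polynomial form of 2Q in terms of the Y ranks."""
    s2 = sum(r * r for r in ranks)
    skr = sum(k * r for k, r in enumerate(ranks, start=1))
    s1 = sum(ranks)
    return ((n - 1) * s2 - 2 * (n + m - 2) * skr - (n - 2 * m + 1) * s1
            + (n + 2 * m - 3) * n * (n + 1) * (2 * n + 1) // 6
            + (n + m * m - 3 * m + 1) * n * (n + 1) // 2
            - m * n * (m - 1))


xs, ys = [0.5, 2.5, 3.5, 7.0], [1.0, 4.0, 6.0]
ranks = y_ranks(xs, ys)
print(q_brute_force(xs, ys), q_from_ranks(len(xs), len(ys), ranks),
      twice_q_closed_form(len(xs), len(ys), ranks) // 2)
```

The rank form needs only one pass over the $n$ ranks, whereas the direct count examines all $\binom{m}{2}\binom{n}{2}$ quadruples.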

It may perhaps be worth noting that the first of the two tests described in 
this section can also be used as the basis of a sequential test of the two sample 
problem. This is clear since the problem is simply reduced to that of testing a 
simple binomial situation against a one-sided class of alternatives. The sequential 
probability ratio test to which one is led in this manner of course is again
unbiased and has a power function that is strictly increasing in
$\int (F - G)^2\, d\left(\frac{F + G}{2}\right)$.

The measure of discrepancy

$$\int (F - G)^2\, d\left(\frac{F + G}{2}\right)$$

utilised in the tests of the present section, suggests using
$\int (\phi_m - \psi_n)^2\, d\left(\frac{\phi_m + \psi_n}{2}\right)$
as test statistic. It should be pointed out that tests of this kind have been studied
in connection with the closely related problem of goodness of fit by Cramér
[19] and von Mises [20]. In the present case, let us denote the $x$’s and $y$’s in order
of magnitude by $x^{(1)} < x^{(2)} < \dots < x^{(m)}$; $y^{(1)} < y^{(2)} < \dots < y^{(n)}$, let $m_1$ be the
number of $x$’s $< y^{(1)}$, $m_2$ the number of $x$’s between $y^{(1)}$ and $y^{(2)}$ etc., and define
$n_1, n_2, \dots$ analogously. Then it is easily seen that


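$$\int (\phi_m - \psi_n)^2\, d\left(\frac{\phi_m + \psi_n}{2}\right)
= \frac{1}{2n}\sum_{i=1}^{n}\left(\frac{m_1 + \dots + m_i}{m} - \frac{i}{n}\right)^2
+ \frac{1}{2m}\sum_{i=1}^{m}\left(\frac{i}{m} - \frac{n_1 + \dots + n_i}{n}\right)^2.$$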


Tests of this type have been proposed by Dixon and by Mood [21], but have 
not been studied thoroughly. 

Finally it should be mentioned that one might also try the method of
randomisation, which has been considered by Pitman [22] and others in connection
with specific classes of alternatives, for the present problem. One statistic which
may be suitable for this purpose if $m = n$ is $\sum_{i=1}^{n} (Y^{(i)} - X^{(i)})^2$.

5. Discontinuous distributions. So far, we have assumed F and G to be con- 
tinuous. This assumption is obviously not satisfied in practice, and we must 
therefore consider the difficulties introduced by discontinuities. (These
difficulties were investigated in connection with various estimation problems by
Scheffé and Tukey [23].)

Let us restrict our attention to rank tests and introduce the convention that
tied observations are ordered at random. Thus if
$X_{i_1} = \dots = X_{i_s} = Y_{j_1} = \dots = Y_{j_t}$,
$s + t = r$, we perform an experiment with $r!$ possible and equally
probable outcomes. We then establish a 1:1 correspondence between the r! 
possible orderings of r objects and these r! outcomes, and treat the X’s and Y’s 
as if they had occurred in the order indicated by this experiment. If the X’s 
and Y’s have the same distribution it is then clear that the distribution of any 
rank statistic of the X’s and Y’s is what it would be if this common distribution 
were continuous since in both cases each possible ordering of the X’s and Y’s is 
again equally probable. 
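The convention of ordering tied observations at random can be sketched in code. The version below attaches an independent uniform random number to every observation as a secondary sort key, so that each of the $r!$ orderings of a group of $r$ tied values is equally likely. The function name and the use of Python’s `random.Random` are choices made for this illustration only.

```python
import random


def ranks_with_random_ties(xs, ys, rng=None):
    """Rank the pooled sample, ordering tied observations at random.
    Returns (ranks of the X's, ranks of the Y's), each in input order."""
    rng = rng or random.Random()
    keyed = [(value, rng.random(), "x", i) for i, value in enumerate(xs)]
    keyed += [(value, rng.random(), "y", j) for j, value in enumerate(ys)]
    keyed.sort()  # primary key: the value; secondary key: the random number
    x_ranks = [0] * len(xs)
    y_ranks = [0] * len(ys)
    for rank, (_, _, label, index) in enumerate(keyed, start=1):
        if label == "x":
            x_ranks[index] = rank
        else:
            y_ranks[index] = rank
    return x_ranks, y_ranks


print(ranks_with_random_ties([1, 2, 2], [2, 3], random.Random(0)))
```

Any rank statistic can then be evaluated on the resulting ranks exactly as in the continuous case.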

In order to see that various unbiasedness results of the preceding and follow- 
ing sections remain valid, we state the following 

LEMMA 5.1. Let $\mathcal{E} = \mathcal{E}(X_1, \dots, X_m, Y_1, \dots, Y_n)$ be a random event
depending only on the ranking of the $X$’s and $Y$’s. Suppose that $F$ and $G$ may have
discontinuities and that in case of ties the event $\mathcal{E}$ is defined by ordering the tied
observations at random. Then there exist continuous c.d.f.’s $F^* = F^*(F, G)$ and
$G^* = G^*(F, G)$ such that

$$P_{F,G}(\mathcal{E}) = P_{F^*,G^*}(\mathcal{E}),$$

and that $F^* = G^*$ if and only if $F = G$.

Proor. We shall only give the construction of F*, G*; the remainder of the 
proof then follows easily. 

Consider the (denumerable) totality of points that are points of discontinuity
of either $F$ or $G$, and suppose these points have been numbered: $x_1, x_2, \dots$.
Consider first $x_1$ and define two new c.d.f.’s $F_1$, $G_1$ as follows:

$F_1(x) = F(x + \tfrac{1}{4})$ if $x < x_1 - \tfrac{1}{4}$,

$F_1(x) = F(x_1 - 0) + \dfrac{x - x_1 + \tfrac{1}{4}}{\tfrac{1}{2}}\,[F(x_1) - F(x_1 - 0)]$ if $x_1 - \tfrac{1}{4} \le x \le x_1 + \tfrac{1}{4}$,

$F_1(x) = F(x - \tfrac{1}{4})$ if $x > x_1 + \tfrac{1}{4}$.

$G_1$ is defined analogously in terms of $G$. What this construction does is to push
$F$ and $G$ apart at $x_1$ symmetrically by a total amount of $\tfrac{1}{2}$, and to distribute
the probability at $x_1$ uniformly over the gap thus created.

In the same way we now push $F_1$ and $G_1$ apart at the second discontinuity
(in its new position) by a total amount of $1/2^2$ and distribute the amount of jump
uniformly over the gap, thus obtaining $F_2$ and $G_2$. Then the sequence $F_1$,
$F_2, \dots$ will converge to a continuous distribution $F^*$ and analogously for the
$G$’s, and $F^*$, $G^*$ will have the desired properties.

It follows from this lemma that the unbiased test of the hypothesis F = G 
discussed in Section 4 remains strictly unbiased when the assumption of con- 
tinuity of F and G is dropped. On the other hand, the power is no longer such a 
simple function of $F$ and $G$. In fact let $X, X'; Y, Y'$ denote as before independent
random variables with distributions $F$ and $G$ respectively and denote by
$X, X' < Y, Y'$ that this ordering occurred after randomisation of ties. Then it is not
difficult to show that

$$P(X, X' < Y, Y') + P(Y, Y' < X, X') = \tfrac{1}{3} + 2\Delta',$$

where

$$3\Delta' = \int \left[(F - G)^2 + (F^- - G^-)^2 + (F - G)(F^- - G^-)\right] d\left(\frac{F + G}{2}\right).$$

Here $F^-(x) = F(x - 0)$, $G^-(x) = G(x - 0)$.

6. Existence of unbiased tests for the hypothesis of independence and some 
other nonparametric problems. In this last section we shall briefly consider 
some more complicated nonparametric problems. Our aim is to prove for all 
these problems the existence of strictly unbiased and consistent tests. The 
problem is treated purely theoretically in that no effort is made to construct 
tests that make good use of the data and that are convenient to apply, but that 
instead the sole purpose is to exhibit tests possessing the properties asked for. 

For the hypothesis of independence Hoeffding proposed a test that he proved 
consistent against all alternatives with continuous joint and marginal prob- 
ability densities. In this connection he also considered the problem of unbiased- 
ness and proved the nonexistence of unbiased critical regions based on rank for 
certain small levels of significance. This negative result seems to contradict 
those of the present section. This is however not so. Hoeffding restricted his 
attention to critical regions while we are here admitting also randomised tests. 
It should be pointed out in this connection that, while randomisation was used 
in previous sections only in a trivial manner, namely so as to get the exact 
level of significance, we shall here make very heavy use of this device. This 
could be avoided in part, however the tests would then become more compli- 
cated. Further if the problem is reduced, as is done here, to that of testing 
equality of two binomial p’s, randomisation is needed to get an exactly similar 
test. 

The hypothesis of independence states that the joint c.d.f. equals the product 
of the two marginal c.d.f.’s. Thus if $(X_i^{(1)}, X_i^{(2)})$, $i = 1, 2, \dots$, are independently
drawn from a bivariate distribution $F$, it is equivalent to the hypothesis that
the pair $(X_1^{(1)}, X_1^{(2)})$ comes from the same bivariate population as the pair
$(X_2^{(1)}, X_3^{(2)})$. It is therefore clear that if we can prove the existence of strictly
unbiased and consistent tests for the bivariate two-sample problem, this will
imply the existence of tests with these properties for the hypothesis of
independence. The same remark clearly applies to hypotheses of independence (both
complete independence and independence of sets of variates) in more than two
variables.


Consider now samples $X_i = (X_i^{(1)}, \dots, X_i^{(k)})$, $i = 1, 2, \dots$ and $Y_j =
(Y_j^{(1)}, \dots, Y_j^{(k)})$, $j = 1, 2, \dots$ from two $k$-variate distributions $F$ and $G$. The
work of section 4 suggests utilising the expression

$$\int (F - G)^2\, d\left(\frac{F + G}{2}\right) = \int (F^2 + G^2)\, d\left(\frac{F + G}{2}\right) - 2\int FG\, d\left(\frac{F + G}{2}\right).$$

All that is necessary is to construct events $A$ and $B$ such that

$$p_1 = P(A) = \int \frac{F^2 + G^2}{2}\, d\left(\frac{F + G}{2}\right), \qquad
p_2 = P(B) = \int FG\, d\left(\frac{F + G}{2}\right).$$

The hypothesis $H: F = G$ will then be reduced to $H': p_1 = p_2$, to be tested against
alternatives $p_1 > p_2$. The events $A$ and $B$ may be defined as follows:

A: With probability $\tfrac{1}{2}$ observe either $X_1, X_2$ or $Y_1, Y_2$ and with
probability $\tfrac{1}{2}$ observe either $X_3$ or $Y_3$. Denote the three variables that are observed
by $Z_1, Z_2, Z_3$, and define $A$ as the event

$$Z_1^{(i)}, Z_2^{(i)} \le Z_3^{(i)}, \qquad i = 1, \dots, k.$$

B: Observe $X_4, Y_4$ and with probability $\tfrac{1}{2}$ either $X_5$ or $Y_5$. If the last of these
variables is denoted by $Z_5$, define $B$ as the event

$$X_4^{(i)}, Y_4^{(i)} \le Z_5^{(i)}, \qquad i = 1, \dots, k.$$

It should be mentioned that instead of observing five random vectors some of 
which may be either X’s or Y’s, we could have obtained a test with the desired 
property based on ten observations, five X’s and five Y’s. 

To complete the proof we must show that the hypotheses H and H' are really
equivalent, that is, that $p_1 = p_2$ if and only if F = G. For the case that F and
G are continuous this follows immediately by an argument similar to the one
given in the univariate case, and it is easy to show it even without
this restriction.


It is clear that one can generalise further and instead of two samples consider
s samples. For this purpose one may replace $\int (F - G)^2 \, d\!\left(\frac{F + G}{2}\right)$, for example,
by $\sum_{i=1}^{s} \int (F_i - \bar{F})^2 \, d\bar{F}$, where $\bar{F}$ is the average of the s c.d.f.'s. Alternatively, one
may utilise the expression $\sum_{i<j} \int (F_i - F_j)^2 \, d\!\left(\frac{F_1 + \cdots + F_s}{s}\right)$.


As a last problem let us consider a sample $X_1, \ldots, X_n$ from an unknown
univariate c.d.f. F, assumed to be continuous. It is desired to test the hypothesis
H of symmetry with respect to the origin, i.e., that $F(x) = 1 - F(-x)$ for all x.
Smirnov [24] recently proposed $\max_x |N^{+}(x) - N^{-}(x)|$ as a test statistic,
where $N^{+}(x)$, $N^{-}(x)$ denote the number of X's contained in the intervals (0, x),
(-x, 0) respectively.
The work of Section 4 suggests considering 4 X's $(X_i, X_j, X_k, X_l)$ and
defining the following two events.
A: Exactly two of the four X's are positive.
B: If A is satisfied, and $X_i, X_j < 0 < X_k, X_l$, say, the event B is said
to occur if neither

$$|X_i|, |X_j| < X_k, X_l \quad \text{nor} \quad X_k, X_l < |X_i|, |X_j|.$$


Then if $F(0) = p_0$ and $q_0 = 1 - p_0$, $P(A) = 6p_0^2 q_0^2$ takes on its maximum value 3/8 if and only
if $p_0 = 1/2$. Further, P(B | A) takes on its maximum value 2/3 if and only if the
conditional distribution F* of -X given X < 0 is the same as that, G*, of X
given X > 0. Thus

$$P(AB) = 6p_0^2 q_0^2 \left[\frac{2}{3} - 2\int (F^* - G^*)^2 \, d\!\left(\frac{F^* + G^*}{2}\right)\right]$$

takes on its maximum value 1/4 if and only if the hypothesis of symmetry
holds.
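The maximum value 1/4 can be checked by simulation. The following is only a minimal sketch, not part of the original argument: the function name `estimate_p_ab`, the normal distributions, the shift 0.5 and the number of quadruples are all illustrative assumptions. It draws independent quadruples, counts how often A and B both occur, and compares a symmetric distribution with a shifted one.

```python
import random

def estimate_p_ab(draw, quadruples=100_000, seed=0):
    """Monte Carlo estimate of P(AB) from independent quadruples.

    `draw(rng)` returns one observation X (hypothetical helper signature).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(quadruples):
        xs = [draw(rng) for _ in range(4)]
        neg = [-x for x in xs if x < 0]   # absolute values of the negative X's
        pos = [x for x in xs if x > 0]
        if len(neg) != 2 or len(pos) != 2:
            continue                       # event A does not occur
        # B fails when one pair lies entirely below the other pair.
        if max(neg) < min(pos) or max(pos) < min(neg):
            continue
        hits += 1
    return hits / quadruples

# Symmetric about the origin: the estimate should be close to 1/4.
symmetric = estimate_p_ab(lambda rng: rng.gauss(0.0, 1.0))
# Shifted (asymmetric about the origin): the estimate should be smaller.
shifted = estimate_p_ab(lambda rng: rng.gauss(0.5, 1.0))
print(symmetric, shifted)
```

With the shifted normal distribution, P(A) alone is already below 3/8, so the estimate falls clearly below 1/4.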


If we apply this method to independent quadruples, we obtain a test that is
strictly unbiased and consistent. If we apply it to all possible quadruples the
test remains consistent and may be a reasonable test for the hypothesis in
question. Hoeffding's theory can again be applied to the asymptotic
distribution problem.


REFERENCES 


[1] H. C. Mathisen, "A method of testing the hypothesis that two samples are from the
same population," Annals of Math. Stat., Vol. 14 (1943), pp. 188-194.
[2] W. Hoeffding, "A class of statistics with asymptotically normal distributions,"
Annals of Math. Stat., Vol. 19 (1948), pp. 293-325.
[3] H. B. Mann and D. R. Whitney, "On a test of whether one of two random variables
is stochastically larger than the other," Annals of Math. Stat., Vol. 18 (1947),
pp. 50-60.
[4] W. R. Thompson, "Biological applications of normal range and associated significance
tests in ignorance of original distribution forms," Annals of Math. Stat., Vol. 9
(1938), pp. 281-287.
[5] W. Hoeffding, "A non-parametric test of independence," Annals of Math. Stat., Vol.
19 (1948), pp. 546-557.
[6] P. R. Halmos, "The theory of unbiased estimation," Annals of Math. Stat., Vol. 17
(1946), pp. 34-43.
[7] E. L. Lehmann and H. Scheffé, "Completeness, similar regions, and unbiased
estimation, Part I," Sankhyā, Vol. 10 (1950), pp. 305-340.
[8] E. L. Lehmann and H. Scheffé, "Completeness, similar regions, and unbiased
estimation, Part II," unpublished.
[9] J. Berkson, "Comments on Dr. Madow's 'Note on tests of departure from normality'
with some remarks concerning tests of significance," Jour. Am. Stat. Assn., Vol.
36 (1941), p. 539.
[10] H. Cramér, "Contributions to the theory of statistical estimation," Skandinavisk
Aktuarietidskrift, Vol. 29 (1946), p. 85.
[11] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same
population," Annals of Math. Stat., Vol. 11 (1940), pp. 147-162.
[12] J. Wolfowitz, "Non-parametric statistical inference," Proceedings of the Berkeley
Symposium on Mathematical Statistics and Probability, University of California
Press, Berkeley, 1949.
[13] N. V. Smirnov, "On the estimation of the discrepancy between empirical curves of
distribution for two independent samples," Bull. Math. Univ. Moscow, Serie Int.,
Vol. 2 (1939).
[14] M. Fréchet, Recherches Théoriques Modernes sur la Théorie des Probabilités, Vol. 1,
p. 260, Gauthier-Villars, Paris, 1937.
[15] W. Feller, "On the Kolmogorov-Smirnov limit theorems for empirical distributions,"
Annals of Math. Stat., Vol. 19 (1948), pp. 177-189.
[16] J. L. Doob, "Heuristic approach to the Kolmogorov-Smirnov theorems," Annals of
Math. Stat., Vol. 20 (1949), pp. 393-403.
[17] N. V. Smirnov, "Approximate laws of distribution of random variables from empirical
data," Uspehi Matem. Nauk, Vol. 10 (1944), p. 179.
[18] H. B. Mann, "Non-parametric tests against trend," Econometrica, Vol. 13 (1945),
pp. 245-259.
[19] H. Cramér, "On the composition of elementary errors," Skandinavisk
Aktuarietidskrift, Vol. 11 (1928), p. 13 and p. 141.
[20] R. von Mises, Wahrscheinlichkeitsrechnung und ihre Anwendung in der Statistik und
theoretischen Physik, Franz Deuticke, Leipzig, 1931.
[21] W. J. Dixon, "A criterion for testing the hypothesis that two samples are from the same
population," Annals of Math. Stat., Vol. 11 (1940), pp. 199-204.
[22] E. J. G. Pitman, "Significance tests which may be applied to samples from any
populations," Jour. Roy. Stat. Soc., Suppl., Vol. 4 (1937), pp. 225-232.
[23] H. Scheffé and J. W. Tukey, "Non-parametric estimation. I. Validation of order
statistics," Annals of Math. Stat., Vol. 16 (1945), pp. 187-192.
[24] N. V. Smirnov, "Sur un critère de symétrie de la loi de distribution d'une variable
aléatoire," C. R. Acad. Sci. URSS, Vol. 56 (1947), p. 11.
[25] F. J. Massey, Jr., "A note on the power of a non-parametric test," Annals of Math.
Stat., Vol. 21 (1950), pp. 440-443.
[26] H. R. van der Vaart, "Some remarks on the power function of Wilcoxon's test for
the problem of two samples. I," Indagationes Math., Vol. 12 (1950), pp. 146-158.







A SIGNIFICANCE TEST FOR EXPONENTIAL REGRESSION 
By E. S. KEEPING
University of Alberta¹


1. Summary. A general method of testing the significance of nonlinear
regression, suggested by Hotelling, is adapted to the regression equations
$Y = be^{px}$ and $Y = a + be^{px}$. The values of x are taken to be in arithmetic progression,
and the standard deviation of the observed y is supposed constant for all x.
This is in contrast to the assumption, implicit in the usual procedure of fitting
a straight line to log y, that the standard deviation of log y is constant.

It will be observed that the distribution of $y_1, y_2, \ldots, y_n$ must be such that
the joint probability density for $y_1, y_2, \ldots, y_n$ is a function of $x_1^2 + x_2^2 + \cdots + x_n^2$,
and this condition implies the assumption of normality. The null hypothesis
is that $be^{px} = 0$ for all x, while the alternative hypotheses are specified by
$b \neq 0$, $p \neq -\infty$.

The method involves the calculation of the volume of a “tube” on a hyper- 
sphere in n-dimensional space. An asymptotic expression for the length of the 
tube is developed, and it is shown that the curvature of the axis is everywhere 
finite. From this expression, for values of the correlation coefficient R between 
observed and fitted values of y at least as great as 0.894, a function of R is 
obtained giving the probability that a random sample would yield at least as 
great a value of R. 

A short table giving R for various significance levels and various sizes of 
sample is calculated for each of the equations mentioned, and the application 
to certain experimental data is discussed. 


2. Introduction. Some years ago, Hotelling [1] suggested a geometrical method
of determining the significance of the correlation coefficient corresponding to a
fitted regression of y upon x, when y is a random variable and the values of x
are known. Suppose that a curve of the form

(2.1)    $Y = b f(x, p)$,

where b, p are constants and f(x, p) is not identically zero, is fitted to a set of
observations $y_1, y_2, \ldots, y_n$, which are assumed to be independently and
normally distributed about zero, with the same variance $\sigma^2$. The null hypothesis
is that b = 0, while the alternative hypothesis is that b is not zero. By the
principle of least squares, we minimize

(2.2)    $\sum (y_\alpha - Y_\alpha)^2 = \sum [y_\alpha - b f(x_\alpha, p)]^2$.


¹ Research carried out at the Institute of Statistics, University of North Carolina.

The set of values $y_1, \ldots, y_n$ defines a point in n-dimensional space. The set
$Y_1, \ldots, Y_n$ also defines a point which lies on the 2-dimensional hypersurface
defined parametrically in terms of b, p by the n equations:

(2.3)    $Y_\alpha = b f(x_\alpha, p), \qquad \alpha = 1, 2, \ldots, n$.

If $\theta$ is the angle between the lines joining the origin to $(y_1, \ldots, y_n)$ and
$(Y_1, \ldots, Y_n)$,

(2.4)    $\cos\theta = \sum y_\alpha Y_\alpha / \left[\sum y_\alpha^2 \sum Y_\alpha^2\right]^{1/2}$,

and this is the correlation coefficient R between the observed and fitted values,
calculated without elimination of the mean. The least squares process is thus
equivalent to maximizing R, or minimizing $\theta$.
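The equivalence between least squares and maximizing R can be illustrated numerically. The sketch below is an illustration only: it assumes $f(x, p) = e^{px}$, uses made-up observations `y`, and searches a hypothetical grid of p values. For each p the best b is found in closed form, and the residual sum of squares then equals $\sum y_\alpha^2 (1 - R^2)$, so the p that minimizes the residuals also maximizes R.

```python
import math

def cos_theta(y, Y):
    """Correlation without elimination of the mean, as in (2.4)."""
    dot = sum(a * b for a, b in zip(y, Y))
    return dot / math.sqrt(sum(a * a for a in y) * sum(b * b for b in Y))

def least_squares_fit(y, xs, p):
    """For fixed p, fit Y = b * exp(p * x) by least squares in b."""
    f = [math.exp(p * x) for x in xs]
    b = sum(a * c for a, c in zip(y, f)) / sum(c * c for c in f)
    Y = [b * c for c in f]
    rss = sum((a - c) ** 2 for a, c in zip(y, Y))
    return Y, rss

xs = [1, 2, 3, 4, 5, 6]
y = [0.9, 1.4, 2.1, 3.3, 4.8, 7.4]            # hypothetical observations
grid = [i / 1000 for i in range(1, 1001)]      # hypothetical search grid for p

fits = {p: least_squares_fit(y, xs, p) for p in grid}
p_min_rss = min(grid, key=lambda p: fits[p][1])
p_max_R = max(grid, key=lambda p: cos_theta(y, fits[p][0]))
print(p_min_rss, p_max_R)
```

Both searches pick the same grid point because the residual sum of squares is a decreasing function of R once b has been fitted.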

Since by the null hypothesis the point $(y_1, \ldots, y_n)$ lies with uniform
probability density anywhere on the surface of a sphere whose centre is the origin,
the density function of the projection of the point $(y_1, \ldots, y_n)$ on the unit
hypersphere has complete spherical symmetry around the origin. This will
also be true if the joint probability density for $y_1, y_2, \ldots, y_n$ is any function
of $x_1^2 + x_2^2 + \cdots + x_n^2$, and so is constant on the hypersphere $x_1^2 + \cdots + x_n^2 = 1$.
The probability that R is greater than some fixed value $R_0$ is therefore, for a
given Y, proportional to the "volume" of the sphere, in the (n - 1)-dimensional
spherical space, having centre Y and geodesic radius $\theta_0 = \cos^{-1} R_0$. The
total probability that R lies between $R_0$ and 1 is therefore given by the ratio
of the "volume" of the "tube" of geodesic radius $\theta_0$, surrounding the curve
formed by the projection of Y, to the total "area" of the unit hypersphere.

Hotelling [1] has shown that the volume of such a tube on a hypersphere in
n-dimensional space is equal to the length of the curve multiplied by

$$\pi^{(n-2)/2} \sin^{n-2}\theta / \Gamma(n/2),$$

provided that the curve is closed, and nowhere has a radius of geodesic
curvature less than $\sin\theta$, and provided also that portions of the tube corresponding
to nonconsecutive arcs of the axial curve do not overlap. If the curve has ends,
there will be hemispherical "caps" at the ends of the tube to be added to the
total volume.
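As a small numerical illustration of Hotelling's factor, the sketch below evaluates the tube "volume" and its ratio to the total "area" $2\pi^{n/2}/\Gamma(n/2)$ of the unit hypersphere. It is a sketch under stated assumptions: the function names are hypothetical, the multiplier is read as $\pi^{(n-2)/2}\sin^{n-2}\theta/\Gamma(n/2)$, and end caps and the overlap conditions are ignored.

```python
import math

def tube_volume(length, theta, n):
    """Length of the axis times pi^((n-2)/2) * sin^(n-2)(theta) / Gamma(n/2)."""
    return length * math.pi ** ((n - 2) / 2) * math.sin(theta) ** (n - 2) / math.gamma(n / 2)

def sphere_area(n):
    """Total 'area' of the unit hypersphere in n-dimensional space."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)

def tube_probability(length, theta, n):
    """Fraction of the hypersphere covered by the tube (caps not included)."""
    return tube_volume(length, theta, n) / sphere_area(n)

# For n = 3 the tube is a band of width 2*sin(theta) around the curve.
print(tube_volume(1.0, 0.3, 3), 2 * math.sin(0.3))
# The ratio simplifies to length * sin^(n-2)(theta) / (2*pi).
print(tube_probability(2.0, 0.3, 5), 2.0 * math.sin(0.3) ** 3 / (2 * math.pi))
```

The simplification of the ratio to $l \sin^{n-2}\theta / 2\pi$ is the form in which the tube term reappears in the probability formulas of Section 7.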

This geometrical method was applied by D. M. Starkey [2] to the case of
periodogram analysis, in which there are additional parameters, so that the
projection of Y is not a curve but a surface. The practical application of the
method in this case is limited to quite small values of $\theta$, because of the
approximations necessary in the evaluation of the integrals involved.

The 3-parameter equation

(2.5)    $Y_\alpha = a + b f(x_\alpha, p)$

is readily reducible, theoretically, to the form treated above. Minimizing
$\sum [y_\alpha - a - b f(x_\alpha, p)]^2$ is equivalent to minimizing $\sum (y'_\alpha - Y'_\alpha)^2$, where
$y'_\alpha$, $Y'_\alpha$ are the projections of $y_\alpha$, $Y_\alpha$ on the hyperplane $\sum y_\alpha = 0$. Since
$y'_\alpha = y_\alpha - \bar{y}$ and $Y'_\alpha = b(f_\alpha - \bar{f})$, where $f_\alpha$ stands for $f(x_\alpha, p)$, the angle $\theta$ between
the lines joining the origin to $(y'_1, y'_2, \ldots, y'_n)$ and $(Y'_1, Y'_2, \ldots, Y'_n)$ is given
by

(2.6)    $\cos\theta = \sum (y_\alpha - \bar{y})(Y_\alpha - \bar{Y}) / \left[\sum (y_\alpha - \bar{y})^2 \sum (Y_\alpha - \bar{Y})^2\right]^{1/2}$

and so is equal to the correlation coefficient R between observed and fitted
values, calculated in the usual way with elimination of the means. The point
$(Y'_1, Y'_2, \ldots, Y'_n)$ lies on a 2-dimensional projection of the 3-dimensional
hypersurface defined parametrically by (2.5). If we now project from the origin
on to a hypersphere of n - 2 dimensions (intrinsically) in the hyperplane
$\sum y_\alpha = 0$, the projection of $(Y'_1, Y'_2, \ldots, Y'_n)$ will be a point $(Y''_1, Y''_2, \ldots, Y''_n)$
lying on a curve on the surface of this hypersphere. The method already
given therefore applies in this case, with the appropriate change in the
dimensionality of the hypersphere.

The present paper deals with the application to exponential regression. The
curves to be fitted are

(2.7)    $Y = be^{px}$

and

(2.8)    $Y = a + be^{px}$,

the latter of which will be referred to as the "modified exponential equation."

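The mathematical difficulties are increased greatly by the additional constant.

3. Formulas for projections of regression curves. We suppose that the fixed
values of x are equidistant, and choose units so that

(3.1)    $x_1 = 1, \; x_2 = 2, \; \ldots, \; x_n = n$.

The corresponding values of Y are

(3.2)    $Y_1 = bq, \; Y_2 = bq^2, \; \ldots, \; Y_n = bq^n$,

where $q = e^p$. If the projections on the unit hypersphere are denoted by $Y'_\alpha$, then

(3.3)    $Y'_\alpha = \lambda q^\alpha, \qquad \sum (Y'_\alpha)^2 = 1$.

Hence

(3.4)    $\lambda^{-2} = q^2 (1 - q^{2n}) / (1 - q^2)$

and

(3.5)    $Y'_\alpha = q^{\alpha - 1} \left[(1 - q^2)/(1 - q^{2n})\right]^{1/2}$.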

The element ds of the curve formed by Y' $(0 < q < \infty)$ is given by

(3.6)    $(ds)^2 = \sum (dY'_\alpha)^2 = \frac{1 - q^2}{1 - q^{2n}} \sum_\alpha \left\{\alpha - 1 - \frac{q^2}{1 - q^2} + \frac{n q^{2n}}{1 - q^{2n}}\right\}^2 q^{2\alpha - 4} (dq)^2$.

This reduces, after some algebraic manipulation, to

(3.7)    $\left(\frac{ds}{dx}\right)^2 = \frac{1}{4x^2} \left[\frac{x}{(1 - x)^2} - \frac{n^2 x^n}{(1 - x^n)^2}\right]$,

where $x = q^2$. The length of the projected curve is obtained by integrating from
0 to $\infty$.
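Formula (3.7) can be checked against a direct finite-difference computation on the projected curve. The following is a minimal sketch, assuming $x = q^2$ and the reading of (3.7) as $(ds/dx)^2 = \frac{1}{4x^2}\left[x(1-x)^{-2} - n^2 x^n (1-x^n)^{-2}\right]$; the function names and the step size `h` are illustrative choices, not part of the paper.

```python
import math

def projected_point(q, n):
    """Unit vector proportional to (q, q^2, ..., q^n)."""
    v = [q ** a for a in range(1, n + 1)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def ds_dq_numeric(q, n, h=1e-5):
    """Central finite difference of the arc length along the projected curve."""
    a = projected_point(q - h, n)
    b = projected_point(q + h, n)
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(a, b))) / (2 * h)

def ds_dq_formula(q, n):
    """ds/dq from (3.7), using x = q^2 and dx/dq = 2q (valid for q != 1)."""
    x = q * q
    ds_dx_sq = (x / (1 - x) ** 2 - n * n * x ** n / (1 - x ** n) ** 2) / (4 * x * x)
    return 2 * q * math.sqrt(ds_dx_sq)

for q in (0.7, 1.8):
    print(q, ds_dq_numeric(q, 4), ds_dq_formula(q, 4))
```

For n = 2 the projected point is (1, q) normalised, so the arc length is the angle $\tan^{-1} q$ and ds/dq should equal $1/(1 + q^2)$, which gives a simple closed-form check of the same formula.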


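For the modified exponential equation, we have, instead of (3.3),

(3.8)    $Y''_\alpha = \lambda (q^\alpha - f/n)$,

where

(3.9)    $f = \sum q^\alpha = q(1 - q^n)/(1 - q)$.

Since $\sum (Y''_\alpha)^2 = 1$, we obtain

$$\lambda = (g - f^2/n)^{-1/2},$$

where

(3.10)    $g = \sum q^{2\alpha} = q^2 (1 - q^{2n})/(1 - q^2)$.

The expression for $(ds/dq)^2 = \sum (dY''_\alpha/dq)^2$ reduces after lengthy algebra to

(3.11)    $\left(\frac{ds}{dq}\right)^2 = \frac{1}{(1 - q^2)^2} \left\{1 - \left[\frac{1 - \dfrac{n^2 q^{n-1} (1 - q)^2}{(1 - q^n)^2}}{\dfrac{n (1 - q)(1 + q^n)}{(1 + q)(1 - q^n)} - 1}\right]^2\right\}$,

whence s may be obtained by integration.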


4. Lengths of the projected curves. From (3.7)

(4.1)    $l_n = \int_0^\infty \left[x(1 - x)^{-2} - n^2 x^n (1 - x^n)^{-2}\right]^{1/2} dx/(2x) = \int_0^1 \left[x(1 - x)^{-2} - n^2 x^n (1 - x^n)^{-2}\right]^{1/2} dx/x$,

since the substitution x = 1/y leaves the integral unchanged.

For n = 2 the integral is elementary and reduces to $\pi/2$. For n = 3 it can
be expressed in terms of elliptic integrals of the first and third kinds. For higher
values of n a convergent series can be obtained, which, however, converges
very slowly for n larger than 5.

Putting $x = (1 - u)/(1 + u)$, $g = 1 - u^2$, we obtain

(4.2)    $l_n = \int_0^1 \left[\frac{1 - u^2}{4u^2} - \frac{n^2 (1 - u^2)^n}{\{(1 + u)^n - (1 - u)^n\}^2}\right]^{1/2} \frac{2\,du}{1 - u^2} = \int_0^1 \left[1 - \frac{2 n^2 u^2 g^{n-1}}{\sum_{k=0}^{n} \binom{2n}{2k} u^{2k} - g^n}\right]^{1/2} \frac{du}{u\, g^{1/2}}$,

and this integral may be shown to exist for every finite n. For n = 3, we have

$$l_3 = (\pi/2)\left[1 + 1/4 + 9/256 + 5/512 + 385/262144 + \cdots\right] = 2.037.$$

For n = 4, a similar method gives $l_4 = 2.35$, and for n = 5 we obtain $l_5 = 2.58$.
However, as n increases, the method is more laborious and the integrand does
not converge so rapidly, so that an asymptotic expression for $l_n$ is more
convenient.

For the case n = 3, (3.11) reduces to

$$\frac{ds}{dq} = \frac{\sqrt{3}}{2(1 + q + q^2)},$$

whence $l'_3 = \pi/3$. That this is correct may be seen by visualizing the projection
of the regression curve on to a circle in the plane through the origin which is
equally inclined to all three axes. Writing $u = \tanh (p/2)$, we obtain

(4.3)    $l'_n = \int_0^1 u^{-1} (1 - \varphi^2)^{1/2} \, du$,

where

(4.4)    $\varphi = \dfrac{1 - \dfrac{4 n^2 u^2 (1 - u^2)^{n-1}}{\{(1 + u)^n - (1 - u)^n\}^2}}{\dfrac{n u \{(1 + u)^n + (1 - u)^n\}}{(1 + u)^n - (1 - u)^n} - 1}$.

It may be shown that the denominator of $\varphi$ never vanishes and that $\varphi < 1$ in
the region of integration.
When n = 4, the integral (4.3) is elliptic, and we find $l'_4 = 1.418$.

When n = 5,

$$\varphi = \frac{25 - 5u^2 + 15u^4 - 3u^6}{25 + 65u^2 + 35u^4 + 3u^6}.$$

The integral may be evaluated by quadrature. The numerical value is $l'_5 = 1.675$.
For larger values of n, an approximation is obtained in the next section.
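The values of $l'_n$ quoted above can be reproduced by a simple quadrature of (4.3). The sketch below is an illustration under stated assumptions: it reads (4.3) as $l'_n = \int_0^1 u^{-1}(1 - \varphi^2)^{1/2}\,du$, and it evaluates $\varphi$ in an algebraically rearranged form, $\varphi = (A^2 - n^2 (1-u^2)^{n-1}) / (A D)$ with $A = \sum_k \binom{n}{2k+1} u^{2k}$ and $D = \sum_{k \ge 1} [n\binom{n}{2k} - \binom{n}{2k+1}] u^{2k}$, chosen here only to reduce cancellation near u = 0. The function names and the midpoint rule with `m` panels are illustrative choices.

```python
import math

def phi(u, n):
    """phi of (4.4), rearranged with binomial sums to limit cancellation."""
    u2 = u * u
    # A = [(1+u)^n - (1-u)^n] / (2u)
    A = sum(math.comb(n, 2 * k + 1) * u2 ** k for k in range(n // 2 + 1))
    # D = n * [(1+u)^n + (1-u)^n] / 2 - A, whose constant term cancels exactly
    D = sum((n * math.comb(n, 2 * k) - math.comb(n, 2 * k + 1)) * u2 ** k
            for k in range(1, n // 2 + 1))
    return (A * A - n * n * (1 - u2) ** (n - 1)) / (A * D)

def modified_length(n, m=400):
    """Midpoint-rule approximation to the integral (4.3) for n >= 3."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        u = (i + 0.5) * h
        total += math.sqrt(max(0.0, 1.0 - phi(u, n) ** 2)) / u
    return total * h

for n in (3, 4, 5):
    print(n, round(modified_length(n), 3))
```

For n = 3 the rearranged expression gives $\varphi = (3 - u^2)/(3 + u^2)$, so the integrand is $2\sqrt{3}/(3 + u^2)$ and the integral is exactly $\pi/3$.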


5. Approximations to the length. Putting $x = e^{-2v}$, we have from (4.1)

(5.1)    $l_n = \int_0^\infty \left(\frac{1}{\sinh^2 v} - \frac{n^2}{\sinh^2 nv}\right)^{1/2} dv$.

For values of v in the range 0 to 1/n we may write the integrand as a series
and integrate term by term. Thus, if

$$I_{n1} = \int_0^{1/n} \left(\frac{1}{\sinh^2 v} - \frac{n^2}{\sinh^2 nv}\right)^{1/2} dv,$$

we obtain

(5.2)    $I_{n1} = 0.559 - 0.298n^{-2} - 0.0594n^{-4} - \cdots$.

For the remainder of the range, writing $u = e^{-v}$,

$$\int_{1/n}^\infty \left(\frac{1}{\sinh^2 v} - \frac{n^2}{\sinh^2 nv}\right)^{1/2} dv = 2\int_0^{e^{-1/n}} \frac{1}{1 - u^2} \left[1 - \frac{n^2 u^{2n-2} (1 - u^2)^2}{(1 - u^{2n})^2}\right]^{1/2} du.$$

The second term in the square bracket is less than 1 at both ends of the range.
Also, it has no maximum within the range. Hence the bracket may be expanded
and integrated term by term, giving

(5.3)    $2\int_0^{e^{-1/n}} \frac{1}{1 - u^2} \left[1 - \frac{n^2 (1 - u^2)^2 u^{2n-2}}{2(1 - u^{2n})^2} - \frac{n^4 (1 - u^2)^4 u^{4n-4}}{8(1 - u^{2n})^4} - \cdots\right] du = I_{n2} + I_{n3} + I_{n4} + \cdots$,

where

(5.4)    $I_{n2} = 2\int_0^{e^{-1/n}} \frac{du}{1 - u^2} = \log n + \log 2 + \frac{1}{12} n^{-2} + O(n^{-4})$,

and

(5.5)    $I_{n3} = -n^2 \int_0^{e^{-1/n}} u^{2n-2} (1 - u^2)(1 - u^{2n})^{-2} \, du = -n^2 \int_0^{e^{-1/n}} \{(u^{2n-2} - u^{2n}) + 2(u^{4n-2} - u^{4n}) + 3(u^{6n-2} - u^{6n}) + \cdots\} \, du$,

which, on integrating, expanding the exponentials, and collecting terms, becomes

$$I_{n3} = -[(3/2)e^{-2} + (5/4)e^{-4} + (7/6)e^{-6} + (9/8)e^{-8} + \cdots] - n^{-2}[(19/24)e^{-2} + (71/192)e^{-4} + (61/216)e^{-6} + (379/1536)e^{-8} + \cdots] + O(n^{-4})$$
$$= -0.229 - 0.114n^{-2} + O(n^{-4}).$$

Similarly,

(5.6)    $I_{n4} = -\frac{n^4}{4} \int_0^{e^{-1/n}} u^{4n-4} (1 - u^2)^3 (1 - u^{2n})^{-4} \, du = -0.029 - 0.029n^{-2} + O(n^{-4})$.

Later terms in $I_n$ can be computed in a similar way, but the numerical factors
diminish rapidly. Collecting terms from (5.2), (5.4), (5.5), (5.6), we get finally

(5.7)    $l_n = \log n + 0.990 - 0.358n^{-2} + O(n^{-4})$.







As an indication of the accuracy of this approximation, the value of $l_n$,
neglecting terms of order $n^{-4}$, has been calculated in Table I for several values of n.
It is not, of course, to be expected that the approximation will be very good
for small n, although it is actually quite close even for n = 3 and n = 4.
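The comparison between the asymptotic expression and the exact length can be repeated numerically. The sketch below is illustrative only: it assumes the reading $l_n = \int_0^\infty (\sinh^{-2} v - n^2 \sinh^{-2} nv)^{1/2}\,dv$ of (5.1) and $l_n \approx \log n + 0.990 - 0.358n^{-2}$ of (5.7), and it uses a plain midpoint rule with a hypothetical cut-off at v = 40, beyond which the integrand is negligible.

```python
import math

def integrand(v, n):
    """Integrand of (5.1); the second term is dropped once it underflows."""
    a = 1.0 / math.sinh(v) ** 2
    nv = n * v
    b = (n / math.sinh(nv)) ** 2 if nv < 350 else 0.0
    return math.sqrt(max(a - b, 0.0))

def midpoint(f, lo, hi, m):
    h = (hi - lo) / m
    return h * sum(f(lo + (i + 0.5) * h) for i in range(m))

def exact_length(n):
    """Numerical value of l_n, integrating over (0, 1) and (1, 40) separately."""
    f = lambda v: integrand(v, n)
    return midpoint(f, 0.0, 1.0, 4000) + midpoint(f, 1.0, 40.0, 4000)

def asymptotic_length(n):
    """Asymptotic expression (5.7) without the remainder term."""
    return math.log(n) + 0.990 - 0.358 / n ** 2

for n in (2, 3, 5, 20):
    print(n, round(exact_length(n), 3), round(asymptotic_length(n), 3))
```

For n = 2 the integrand reduces to sech v, so the quadrature should return $\pi/2$, in agreement with the elementary value quoted in Section 4.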


TABLE I

Length of axis of tube ($Y = be^{px}$)

    n      Asymptotic value of $l_n$      Exact value of $l_n$
    2              1.59                        1.57
    3              2.05                        2.04
    4              2.35                        2.35
    5              2.59                        2.58
    6              2.77
    7              2.93
    8              3.06
    9              3.18
   10              3.29
   12              3.47
   15              3.70
   20              3.98
   50              4.90
  100              5.60

For the modified exponential equation, the above method is apparently not
practicable with the more complicated integral (4.3). However, it is possible
to obtain an approximate expression which will be valid for large n, although
the agreement with the numerical values for n = 3, 4, and 5 is not very close.

In terms of v = p/2, (4.3) may be written

(5.8)    $l'_n = \int_0^\infty 2(\sinh 2v)^{-1} \left\{1 - \left[\frac{1 - n^2 \sinh^2 v \, (\sinh nv)^{-2}}{n \tanh v \, (\tanh nv)^{-1} - 1}\right]^2\right\}^{1/2} dv$.

For values of v between 0 and 1/n, the integrand may be expanded in powers
of v and integrated term by term. The result is

(5.9)    $I'_{n1} = 0.500 - 1.08n^{-2} + O(n^{-4})$.

For any fixed u between 0 and 1, the function $\varphi$ in (4.4) tends to $(nu - 1)^{-1}$
as $n \to \infty$, so that the integrand in (4.3) tends to $u^{-1}[1 - (nu - 1)^{-2}]^{1/2}$. However,
this approximation is clearly not useful for u near 1/n. But if we put u = k/n,
where k is a fixed integer, then, for large n, $\varphi$ tends to the value

(5.10)    $\dfrac{(e^k - e^{-k})^2 - 4k^2}{(e^k - e^{-k})\,[(k - 1)e^k + (k + 1)e^{-k}]}$.

For fairly large k, this is very close to $(k - 1)^{-1}$. Thus for k = 7, it is 0.1667.
Hence from u = 7/n to 1 we can approximate $\varphi$ asymptotically by $(nu - 1)^{-1}$,
and obtain

$$I'_{n3} \sim \int_{7/n}^{1} [1 - (nu - 1)^{-2}]^{1/2} u^{-1} \, du = \int_{w_1}^{w_2} [1 - \operatorname{sech} w] \, dw,$$

where $nu - 1 = \cosh w$ and $w_1$, $w_2$ are the values of w corresponding to u =
7/n and u = 1 respectively. Hence

(5.11)    $I'_{n3} \sim w_2 - w_1 - \tan^{-1} \sinh w_2 + \tan^{-1} \sinh w_1 = -1.952 + \log n + (1/4)n^{-2} + O(n^{-3}) \qquad (n > 7)$.
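The two numerical statements just made, the value of (5.10) at k = 7 and the constant in (5.11), are easy to recompute. The sketch below is only an illustration: it assumes the reading of (5.10) as $[(e^k - e^{-k})^2 - 4k^2] / \{(e^k - e^{-k})[(k-1)e^k + (k+1)e^{-k}]\}$ and evaluates the closed form $w_2 - w_1 - \tan^{-1}\sinh w_2 + \tan^{-1}\sinh w_1$ with $\cosh w_1 = 6$ and $\cosh w_2 = n - 1$; the function names are hypothetical.

```python
import math

def phi_limit(k):
    """Limiting value (5.10) of phi at u = k/n for large n."""
    s = math.exp(k) - math.exp(-k)
    return (s * s - 4 * k * k) / (s * ((k - 1) * math.exp(k) + (k + 1) * math.exp(-k)))

def tail_integral(n, k=7):
    """Closed form of the integral of 1 - sech(w) between w1 and w2 in (5.11)."""
    w1 = math.acosh(k - 1)      # n*u - 1 = cosh(w) at u = k/n
    w2 = math.acosh(n - 1)      # n*u - 1 = cosh(w) at u = 1
    return (w2 - w1) - (math.atan(math.sinh(w2)) - math.atan(math.sinh(w1)))

print(phi_limit(7), 1 / 6)
for n in (10, 100, 1000):
    print(n, tail_integral(n) - math.log(n))
```

As n grows, the printed difference settles near -1.952, and the limit (5.10) at k = 7 is already within about 0.0001 of 1/6.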


It remains to integrate between tanh (1/n) and 7/n. From 1/n to 7/n, an
approximation may be obtained by quadrature, the ordinates being calculated
from (5.10) for values of k between 1 and 7 inclusive. This gives, by Simpson's
rule, a value 1.573. A small correction may be made for the integral between
tanh (1/n) and 1/n. Since $\tanh (1/n) = 1/n - (1/3)n^{-3}$, approximately, and
since for u = 1/n, $u^{-1}[1 - \varphi^2]^{1/2} = 0.475n$ approximately, this integral will be
$0.158n^{-2}$, neglecting terms of higher order.

Hence the final expression for the length of the axis of the tube is

(5.12)    $l'_n = \log n + 0.121 - 0.67n^{-2} + O(n^{-3})$.

Table II gives a few numerical values of $l'_n$, neglecting $O(n^{-3})$.


TABLE II

Length of axis of tube ($Y = a + be^{px}$)

    n      Asymptotic value of $l'_n$      Correct value of $l'_n$
    3              1.14                        1.05
    4              1.47                        1.42
    5              1.70                        1.68
    6              1.89
    7              2.05


6. Curvature of the projected curve. It was shown by Hotelling [1] that to
avoid difficulties connected with local overlapping, or "kinking", of the tube
surrounding the projected curve, it is necessary and sufficient that $\sin\theta < \rho$,
where $\theta$ is the geodesic radius of the tube and $\rho$ the radius of geodesic curvature
of its axis. In this section we show that the radius of curvature is always finite
and greater than $1/\sqrt{5} = 0.447$. The statement by Hotelling (loc. cit., p. 452),
that the radius of curvature of the projected curve corresponding to $Y = be^{px}$
becomes zero at $p = \pm\infty$, appears to be in error.
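The radius of curvature $\rho$ with which we are dealing is defined by

(6.1)    $\rho^{-2} = \sum (d^2Y'_\alpha/ds^2)^2 = (ds/dp)^{-6} \left[(ds/dp)^2 \sum (d^2Y'_\alpha/dp^2)^2 + (d^2s/dp^2)^2 \sum (dY'_\alpha/dp)^2 - 2\, ds/dp \cdot d^2s/dp^2 \sum (dY'_\alpha/dp \cdot d^2Y'_\alpha/dp^2)\right]$.

Since

$$\sum (dY'_\alpha/dp)^2 = (ds/dp)^2, \qquad \sum (dY'_\alpha/dp \cdot d^2Y'_\alpha/dp^2) = ds/dp \cdot d^2s/dp^2,$$

this reduces to

(6.2)    $\rho^{-2} = (ds/dp)^{-4} \left\{\sum (d^2Y'_\alpha/dp^2)^2 - (d^2s/dp^2)^2\right\}$.

In the present problem, $Y'_\alpha = \lambda e^{\alpha p}$, so that

$$d^2Y'_\alpha/dp^2 = e^{\alpha p} (d^2\lambda/dp^2 + 2\alpha \, d\lambda/dp + \alpha^2 \lambda),$$

where

$$\lambda^{-2} = \sum e^{2\alpha p} = (e^{2np} - 1)/(1 - e^{-2p}).$$

After some reduction we obtain

(6.3)    $\sum (d^2Y'_\alpha/dp^2)^2 = \frac{3}{16} \left[\frac{1}{\sinh^2 p} - \frac{n^2}{\sinh^2 np}\right]^2 + \frac{1}{8} \left[\frac{3 + 2\sinh^2 p}{\sinh^4 p} - \frac{n^4 (3 + 2\sinh^2 np)}{\sinh^4 np}\right]$.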


From equation (3.7) we have, in terms of p = log q as parameter,

(6.4)    $ds/dp = \tfrac{1}{2} \{(\sinh p)^{-2} - n^2 (\sinh np)^{-2}\}^{1/2}$,

whence

(6.5)    $d^2s/dp^2 \cdot ds/dp = \tfrac{1}{4} \{-\cosh p \, (\sinh p)^{-3} + n^3 \cosh np \, (\sinh np)^{-3}\}$.

Therefore, from (6.2),

(6.6)    $\rho^{-2} = 3 + \left[\frac{6 + 4\sinh^2 p}{\sinh^4 p} - \frac{n^4 (6 + 4\sinh^2 np)}{\sinh^4 np}\right] \left[\frac{1}{\sinh^2 p} - \frac{n^2}{\sinh^2 np}\right]^{-2} - 4\left[\frac{\cosh p}{\sinh^3 p} - \frac{n^3 \cosh np}{\sinh^3 np}\right]^2 \left[\frac{1}{\sinh^2 p} - \frac{n^2}{\sinh^2 np}\right]^{-3}$.

Now as $p \to 0$, the right hand side tends to $3 - [6(n^2 + 1)]/[5(n^2 - 1)]$. For
n = 2 this is 1, as it should be, and as $n \to \infty$ it approaches the value 9/5. It
may be shown also that for n > 4, $\rho^{-2} \to 5 - 2e^{-2|p|}$ as $p \to \pm\infty$, so that $\rho^{-2} \approx 5$
for moderate values of p.

If $u = (\sinh^2 p)^{-1} - n^2 (\sinh^2 np)^{-1}$, the expression for $\rho^{-2}$ can be written as

(6.7)    $\rho^{-2} = 3 - u^{-3} u'^2 + u^{-2} u''$.

Hence

(6.8)    $d(\rho^{-2})/dp = u^{-4} [3u'^3 - 4u u' u'' + u^2 u''']$.


By expanding in powers of e~” it can be shown that the terms in e” and the 
constant term vanish, so that 


d(p*)/dp = O(e*”), p> od, 


for n > 4. Hence $\rho^{-2}$ has no maximum or minimum at any point p, apart from
the minimum at p = 0. The radius of curvature is therefore finite and remains
between $1/\sqrt{5}$ and $[(5n^2 - 5)/(9n^2 - 21)]^{1/2}$, i.e., between 0.447 and 0.745, for
any n > 4.

The condition for no local overlapping at any point of the tube is, therefore,
$\sin\theta < 0.447$, or equivalently, $\cos\theta > 0.894$, where $\theta$ is the geodesic radius of
the tube.
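The two limiting values of the radius of curvature can be checked directly from (6.7). The sketch below is an illustration under stated assumptions: it takes $u = (\sinh^2 p)^{-1} - n^2 (\sinh^2 np)^{-1}$, reads (6.7) as $\rho^{-2} = 3 - u'^2/u^3 + u''/u^2$, and differentiates u analytically; the function name and the sample values of n and p are arbitrary choices.

```python
import math

def inv_rho_sq(p, n):
    """rho^(-2) from (6.7), with u, u', u'' written out in hyperbolic functions."""
    s, c = math.sinh(p), math.cosh(p)
    S, C = math.sinh(n * p), math.cosh(n * p)
    u = 1 / s ** 2 - n ** 2 / S ** 2
    du = -2 * c / s ** 3 + 2 * n ** 3 * C / S ** 3
    d2u = (6 + 4 * s * s) / s ** 4 - n ** 4 * (6 + 4 * S * S) / S ** 4
    return 3 - du * du / u ** 3 + d2u / u ** 2

n = 6
near_centre = inv_rho_sq(0.01, n)   # close to the limit as p -> 0
far_out = inv_rho_sq(3.0, n)        # close to the limit as p -> infinity
print(near_centre, (9 * n * n - 21) / (5 * n * n - 5))
print(far_out, 1 / math.sqrt(far_out))
```

For n = 6 the value near the centre is close to $(9n^2 - 21)/(5n^2 - 5) = 303/175$, and far from the centre $\rho$ approaches $1/\sqrt{5} = 0.447$, which is the figure entering the condition $\sin\theta < 0.447$.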

For the modified exponential curve, (6.2) still holds, with $Y'_\alpha$ replaced by
$Y''_\alpha = (g - f^2/n)^{-1/2} (e^{\alpha p} - f/n)$. We now have

(6.9)    $(ds/dp)^2 = (4\sinh^2 p)^{-1} \left\{1 - \left[\frac{1 - n^2 \sinh^2 (p/2) \, (\sinh np/2)^{-2}}{n \tanh (p/2) \, (\tanh np/2)^{-1} - 1}\right]^2\right\}$.

I have not been able to obtain an explicit expression for the curvature of the
axis of the tube, similar to (6.6). However, for small values of p, $\sum (d^2Y''_\alpha/dp^2)^2$ and ds/dp
may be expressed in series of powers of p, and I find after much algebraic
calculation that when $p \to 0$,

(6.10)    $1/\rho^2 = \dfrac{19n^4 - 212n^2 + 544}{7n^4 - 56n^2 + 112}$.

For n = 3 this reduces to 1, as it should, since in this case the curve is an arc
of a unit circle.

As $n \to \infty$, $\rho \to (7/19)^{1/2} = 0.607$, so that the radius of curvature at the centre
of the axial curve of the tube lies between 0.607 and 1 for all values of n > 3.

To find the curvature at the ends of the axial curve we need the limit of $1/\rho$
as $p \to \pm\infty$. Since the curve is symmetrical about p = 0, it is sufficient to
consider $p \to \infty$. For n > 5, it may be shown that, as $p \to \infty$,


» 5 ~ 4 3 
, iui 5n” — 35n° + 78n° — 108n — 12 
“— eae 


As $n \to \infty$, $\rho \to 1/\sqrt{5} = 0.447$, as for the simpler case treated above. For
n = 6, $\rho$ has the value 0.519.

Intuitively, one would expect for $Y = a + be^{px}$, as we have shown for $Y =
be^{px}$, that the curvature of the projection would vary monotonically between
the centre and the ends. A proof of this would, however, be desirable. Assuming
that this statement is true, we have as the condition for no local overlapping
that $\cos\theta > 0.894$.

It is of interest to see how much of the length of the tube will have a radius
of curvature near to the minimum value 0.447, which corresponds to $p = \pm\infty$.
Now, for n > 5, and for large p, equation (6.6) can be written

(6.12)    $\rho^{-2} = 5 - 2e^{-2|p|} + O(e^{-4|p|})$,

so that for |p| > 1, the first two terms may be considered a fair approximation.
If we take |p| = 2 log 2 = 1.386, $\rho^{-2} = 39/8$, or $\rho = 0.453$, approximately.
The length of the part of the tube for which the radius of curvature lies between
0.447 and 0.453 is

$$2\int_0^{1/4} (1 - u^2)^{-1} \{1 - n^2 u^{2n-2} (1 - u^2)^2 (1 - u^{2n})^{-2}\}^{1/2} \, du,$$

which equals 0.511 approximately for n > 5. Since for n = 5, the whole length
of the tube is 2.59, nearly one-fifth of this length has a radius of curvature
between 0.447 and 0.453.

As n increases, the ratio of this part to the total length diminishes to zero, 
but for n = 100 it is still about 1/11. Hence it appears that local overlapping 
may be serious for values of cos @ appreciably less than the critical value. 

Nonlocal overlapping will not occur. It is necessary for such overlapping that 
the tube should bend around so that two points P; and P: of the axis are at a 
geodesic distance apart less than twice the radius of the tube even though they 
are separated by a considerably greater distance than this, measured along the 
axis. 

If $P_1$ and $P_2$ correspond to values $q_1$ and $q_2$ of the parameter q, the square of
the distance in Euclidean n-space between them is given by

(6.13)    $D^2 = \sum_\alpha (\lambda_1 q_1^\alpha - \lambda_2 q_2^\alpha)^2$,

where

$$\lambda_i^2 = q_i^{-2} (1 - q_i^2)(1 - q_i^{2n})^{-1}, \qquad i = 1, 2.$$

If we transform to polar coordinates $\theta_1, \ldots, \theta_{n-1}$ on the unit hypersphere and
let $P_1$ be the point (0, 0, ..., 0) and $P_2$ the point ($\alpha$, 0, ..., 0), then $D^2 =
2 - 2\cos\alpha$ and the geodesic distance between $P_1$ and $P_2$ is $\alpha$. If, therefore, D
is the distance from a fixed point $q_1$ to a variable point q,

(6.14)    $D^2 = 2 - 2\, \dfrac{1 - q^n q_1^n}{1 - q q_1} \left\{\dfrac{(1 - q^2)(1 - q_1^2)}{(1 - q^{2n})(1 - q_1^{2n})}\right\}^{1/2}$,

and a minimum value of $D^2$ corresponds to a maximum value of

(6.15)    $\cos^2\alpha = \left(\dfrac{1 - q^n q_1^n}{1 - q q_1}\right)^2 \dfrac{(1 - q^2)(1 - q_1^2)}{(1 - q^{2n})(1 - q_1^{2n})}$.





The ends of the axis of the tube are at a geodesic distance $\pi/2$ apart. One end
of the axis is at the point where the positive $x_1$ coordinate axis cuts the unit
hypersphere and the other end is at the point where the positive $x_n$ axis cuts it. The
axis of the tube lies wholly on that part of the surface of the hypersphere for
which all coordinates are positive, and so cannot spiral around the end points
or form an equatorial spiral around the sphere in the middle.

If there is nonlocal overlapping there must be at least three distinct roots of
(6.15), considered as an equation in q corresponding to a given $q_1$ and a given
$\alpha$ (less than twice the geodesic radius of the tube). Two of the roots will represent
neighbouring points on the axis, one on each side of $q_1$, and the others, if they
exist, correspond to points on nonlocal portions of the axis. If the point $q_1$ is at
a geodesic distance less than $\alpha$ from either end of the axis of the tube, the existence
of two distinct roots would imply nonlocal overlapping.

Since the tube is symmetrical about the middle of its axis (at q = 1) we can
assume $0 < q_1 < 1$. Then q can take any real value from 0 to $\infty$. By the condition
for the absence of local overlapping, $\cos^2\alpha > 0.360$.

If $q_1 = 0$, the equation for q becomes

(6.16)    $1 + q^2 + q^4 + \cdots + q^{2n-2} - \sec^2\alpha = 0$,

and this, by Descartes' rule of signs, has at most one real positive root. There is therefore
no nonlocal overlapping at the ends of the tube, for any value of n. This is also
true at the middle, where $q_1 = 1$.

For any finite n the geodesic distance $\beta$ of $q_1$ from the end q = 0 is given by
$1 - q_1^2 = (1 - q_1^{2n}) \cos^2\beta$. If there is to be no nonlocal overlapping, the equation
in q,

(6.17)    $\left(\dfrac{1 - q^n q_1^n}{1 - q q_1}\right)^2 \dfrac{1 - q^2}{1 - q^{2n}} = \dfrac{\cos^2\alpha}{\cos^2\beta}$

should possess only one real positive root if $\beta < \alpha$ and only two real positive
roots if $\beta > \alpha$. If $\beta = \alpha$, one root is q = 0. The equation in this case becomes

(6.18)    $1 + q^2 + \cdots + q^{2n-2} = (1 + q q_1 + \cdots + q^{n-1} q_1^{n-1})^2$.

Writing $y_1 = 1 + q^2 + \cdots + q^{2n-2}$ and $y_2 = (1 + q q_1 + \cdots + q^{n-1} q_1^{n-1})^2$, it
is clear that $y_1$ and $y_2$ and all their derivatives are nonnegative for all q, that $y_1$
and $y_2$ are never zero, and that at q = 0, $y_2' > 0$. The curve of $y_2$ as a function
of q starts together with that of $y_1$ at q = 0 and remains above the latter for
an interval of q > 0. When q tends to infinity, $y_2/y_1 \to q_1^{2n-2}$, which is less than
1, so that the curves must eventually cross again. A real root of $y_1 = y_2$ greater
than zero therefore exists. We shall now show that this root is unique.

Since $y_1 < \sqrt{y_2} < y_2$ for $0 < q < q_1$, we can confine our attention to the case
$q > q_1$.

If $q_1 = 1$, (6.18) clearly has no real root for q > 0. Moreover, $y_2$ is a continuous
function of $q_1$. Hence, if for any $q_1$ between 0 and 1, two real roots exist, there
must be some $q_1$ and some corresponding q such that (6.18) has a double root.
It is sufficient therefore to show that such a double root cannot exist. The
conditions are

(6.19)    $y_1 - y_2 = 0, \qquad y_1' - y_2' = 0$.
Writing yi = [n-—- Gl + ¢ +--+ Q")/0 — ¢) and ys = Vaan — 


n—-l an-l 


qnQl + aqa+ :-+: +q° qr )\/C1 — aq), the second condition of (6.19) gives 


n-¢—qg—::-—¢@ 1 — qu 
6.20) Seer ee ee ee ee ee Oe ee es 
¢ choot cee tee ee 


From the first condition, 


no Yi l+¢@t+s:-- +q" Y 
‘Yo =~ a oa Ce ee oe 
va Vn litent: +e CS 


Substituting for ~/y2 in (6.20), subtracting 1 from both sides, removing a common 
factor g — qq , cross-multiplying and collecting terms, we arrive at an equation 
of the form 


A + Ban + Cq’qi + --: + Zq”" ar” = 0, 


where all the coefficients are positive. This can obviously not be satisfied for 
positive gq and q . 

In the more general case of (6.17) the equation is $y_1 = cy_2$, where $c = \cos^2\beta /
\cos^2\alpha$, and it is readily verified that the above argument holds for c > 1. If
c < 1, the curve for $cy_2$ is below that for $y_1$ at q = 0 and at $q = \infty$, so that if a
real root exists at all there will be at least two roots. These roots cannot coincide,
since if they did we should have $y_1 = cy_2$ and $y_1' = cy_2'$ simultaneously, which is
ruled out by the above argument. The same argument also shows that there
cannot be more than two roots. That there are at least two follows from the
fact that when $y_1 = \sec^2\beta$, say at x = b, $cy_2 = \sec^2\beta \sec^2\alpha$, so that $y_1/cy_2 =
1/\sec^2\alpha < 1$. The curve for $y_1$ is therefore below that of $cy_2$ at x = b.

For the modified exponential curve the expression for cos’ a corresponding to 
(6.15) is 


( aac non = n oa n\\ 2 
(6.21) cos” a= {hace — (1 — gr)(l — 9") s | ip, : 


Ll — qn n(l — q)(1l — q)) / 


where 


pelts ae () ~ t) 
; l1-—¢ n\l—q 
and /, is the same expression with q instead of q. 
When qi = 1, 


(6.22) cos’ a 







which may be written 


1+q" {1 + 3(n — 1) sec’ a/(n + 1)} 


or 


pic, i dite anny f 1 in —_eww o) - go 
Q+gQ+qtqt +4q (1 + Se Dee (1+ q") = 


Since in this equation there are only two changes of sign, the factor 


1. 3(n — 1) sec’ a 


— + ——____—_——_ being less than 1 for admissible values of a, there cannot be 
n n(n + 1) 


more than two real positive roots. Hence there is no nonlocal overlapping near 
the middle of the tube. 
When qi = 0, 


wan 8 f- el / 
(6.23) cos a i nao} /* 


It may be shown that the derivative of the right-hand side, considered as a function of $q$, is negative for all values of $q > 0$. Since the right-hand side is equal to 1 for $q = 0$ and to $(n - 1)$ for $q = \infty$, there is just one real positive root for any admissible value of $\cos^2\alpha$. Therefore no nonlocal overlapping is possible at the ends of the tube.

Moreover, it is readily shown that $\cos^2\alpha$ in (6.22) has no maximum or minimum for any value of $q$ except 0 or 1. That is, the geodesic distance from the midpoint of the tube to a variable point P of the axis increases monotonically as P moves away towards either end. The same conclusion follows from (6.23) as P moves away from the end of the tube towards the middle. This circumstance, which of course holds also for the simple exponential tube, suggests that the possibility of nonlocal overlapping is effectively ruled out.


7. Probability formulas and tables. The "volume" of a tube of geodesic radius $\theta$ surrounding the projected curve will be given by


(7.1)    $l_n \pi^{(n-2)/2} \sin^{n-2}\theta \,/\, \Gamma(\tfrac{1}{2}n)$,


where $l_n$ is a function of $n$ evaluated in Section 5. For any value of $n > 2$ there will also be hemispherical "caps" at the ends of the curve. The "volume" of a complete cap of radius $\theta$ surrounding a given point on the hypersphere is


Og h(n? 2rsin(6/2) f° 

- + n—3 

: — cos gy sin” o dg ds 
* r[4(n — 2)]} l 0 

(7.2) : 


= 2r'" sin” (6/2)/T(4n), 





194 E. S. KEEPING 


and we may consider this as the sum of the two hemispherical caps at the ends.
The probability that a random sample point will lie within the tube is therefore given by


(7.3)    $p(\theta) = \dfrac{l_n}{2\pi} \sin^{n-2}\theta + \sin^{n-1}(\theta/2)$.


In terms of the correlation coefficient $R$, the probability of obtaining by chance a value of $R$ at least as great as $R_0$ is


(7.4)    $p(R_0) = \dfrac{l_n}{2\pi}(1 - R_0^2)^{(n-2)/2} + \{\tfrac{1}{2}(1 - R_0)\}^{(n-1)/2}$.


This, of course, is true only for values of $R_0 > 0.894$. It will often happen, however, that when the data suggest an exponential trend the correlation between observed and fitted values will be high.
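
As a hedged illustration, the two-term probability can be evaluated numerically. The sketch below assumes that (7.4) reads $p(R_0) = (l_n/2\pi)(1 - R_0^2)^{(n-2)/2} + \{\tfrac{1}{2}(1 - R_0)\}^{(n-1)/2}$; the function name `tail_probability` and the argument `curve_length` (standing for $l_n$, whose value comes from Section 5 and is not reproduced here) are illustrative choices, not part of the paper.

```python
import math


def tail_probability(r0: float, n: int, curve_length: float) -> float:
    """Evaluate the assumed form of (7.4) for P(R >= r0).

    curve_length stands for l_n and must be supplied by the caller.
    The formula is stated in the text to hold only for r0 > 0.894.
    """
    if not 0.0 < r0 <= 1.0:
        raise ValueError("r0 must lie in (0, 1]")
    # Tube term: (l_n / 2 pi) * (1 - r0^2)^((n - 2) / 2)
    tube = curve_length / (2.0 * math.pi) * (1.0 - r0 * r0) ** ((n - 2) / 2.0)
    # Cap term: {(1 - r0) / 2}^((n - 1) / 2)
    caps = (0.5 * (1.0 - r0)) ** ((n - 1) / 2.0)
    return tube + caps
```

Inverting this function in `r0` for a fixed significance level (for example by bisection) would give entries of the kind tabulated in Table III, provided $l_n$ is known.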

The value of $R_0$ corresponding to an assigned significance level can be calculated from equation (7.4) for particular values of $n$. A few such values are given in Table III, where it will be noted that in each column the last entry is below the critical value.


TABLE III 


Values of correlation coefficient corresponding to certain significance
levels ($Y = be^{px}$)


Significance level 


Ol .001 .0001 


. 9999 .9999 

999 .9999 

991 .9996 
976 .992 
956 .983 
935 .970 
9 912 955 
10 889 .939 
12 | 906 
15 859 





For the modified exponential, the probability of obtaining by chance a value of the correlation coefficient $R$ between observed and computed values of $y$ at least as great as $R_0$, on the null hypothesis that $b = 0$ in the parent population, is given by


$p(R_0) = \dfrac{l_n}{2\pi}(1 - R_0^2)^{(n-3)/2} + \{\tfrac{1}{2}(1 - R_0)\}^{(n-2)/2}$,


for $R_0 > 0.894$. This is the same formula as (7.4) with $n - 1$ instead of $n$, because
of the loss of one dimension in projection. Also, $l_n$ is now given by (5.12), instead







of (5.7). A short table, computed on the basis of this formula and assuming that 
no overlapping exists, is appended as Table IV. 


TABLE IV 


Correlation coefficient corresponding to certain significance levels ($Y = a + be^{px}$)


Significance level 


01 001 .0001 





. 9999 .9999 
.998 . 9998 
.989 .998 
.972 .991 
.951 .981 
.904 .952 
. 881 .935 
11 .918 
12 .901 


Note that for Table III the correlation coefficient is computed without elim- 
ination of the means, whereas for Table IV the correlation coefficient is computed 
in the usual way. 


8. Methods of fitting curves. As n increases, smaller values of R become sig- 
nificant at any given level, and for moderately large n these values of R, at the 
lower levels, pass out of the region for which the calculation is valid. However, 
the table may be useful in deciding whether an assumed exponential regression 
is plausible, when the number of sample points available is fairly small. 

Tables for use in fitting exponential curves may be found in Glover's Tables [3] but unfortunately these tables cover a very limited range of values of $p$ (from 0 to 0.0953, $e^p$ between 1.0 and 1.1). The exact solution by least squares is laborious. Values of $b$ and $p$ are calculated from the equations

(8.1)    $\sum y e^{px} = b \sum e^{2px}$,
         $\sum x y e^{px} = b \sum x e^{2px}$.


Rough approximations to b and p may be found by fitting a straight line to the 
values of log y, and these approximations may be improved by Seidel’s method. 
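
The procedure just described can be sketched numerically. The example below assumes the least squares equations are $\sum y e^{px} = b\sum e^{2px}$ and $\sum xye^{px} = b\sum xe^{2px}$, starts from a straight line fitted to $\log y$, and then refines $p$ by a secant iteration. The secant step is a stand-in chosen for brevity and is not Seidel's method; the function name and tolerances are illustrative, and convergence on noisy data is not guaranteed.

```python
import math


def fit_exponential(xs, ys, tol=1e-12, max_iter=100):
    """Least squares fit of Y = b * exp(p * x); requires all ys > 0."""

    def b_of(p):
        # First normal equation solved for b at a given p.
        num = sum(y * math.exp(p * x) for x, y in zip(xs, ys))
        den = sum(math.exp(2.0 * p * x) for x in xs)
        return num / den

    def g(p):
        # Residual of the second normal equation after eliminating b.
        lhs = sum(x * y * math.exp(p * x) for x, y in zip(xs, ys))
        rhs = b_of(p) * sum(x * math.exp(2.0 * p * x) for x in xs)
        return lhs - rhs

    # Rough start: slope of the straight line fitted to log y.
    n = len(xs)
    logs = [math.log(y) for y in ys]
    x_bar = sum(xs) / n
    l_bar = sum(logs) / n
    slope = (sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, logs))
             / sum((x - x_bar) ** 2 for x in xs))

    # Secant iteration on g(p) = 0 (an assumption of this sketch).
    p0, p1 = slope, slope + 1e-3
    g0, g1 = g(p0), g(p1)
    for _ in range(max_iter):
        if g1 == g0:
            break
        p2 = p1 - g1 * (p1 - p0) / (g1 - g0)
        p0, g0 = p1, g1
        p1, g1 = p2, g(p2)
        if abs(p1 - p0) <= tol * (1.0 + abs(p1)):
            break
    return b_of(p1), p1
```

Eliminating $b$ first reduces the problem to a single equation in $p$, which is why a one-dimensional root finder suffices here.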

Villars [4] has recently given some approximate methods of fitting the modified exponential equation $y = a + be^{px}$. In the first method the $n$ observations ($n$ even) are divided into two groups, one including the 1st, 3rd, 5th, etc., and the other the 2nd, 4th, 6th, etc. The relationship between the expectations of corresponding members of the two groups is


(8.2)    $\nu_j = h + m\mu_j$,    $j = 1, 2, \ldots, n/2$,







where


(8.3)    $\mu_j = a + be^{(2j-1)p}$,
         $\nu_j = a + be^{2jp}$,

and so $m = e^p$, $h = a(1 - e^p)$. Hence if a straight line through the origin is fitted to the observed $u$ and $v$ values, where $u_j = y_{2j-1}$, $v_j = y_{2j}$, both variables being subject to error, the slope of this line will give an estimate of $e^p$ and its intercept will give an estimate of $a(1 - e^p)$. An estimate of $b$ can then be found from (8.5), or alternatively both $a$ and $b$ can be found from (8.4) and (8.5), where


(8.4)    $na + b\sum m^x = \sum y$,
(8.5)    $a\sum m^x + b\sum m^{2x} = \sum ym^x$.


An alternative method, also given by Villars, is applicable whether $n$ is odd or even. It consists in treating $y_j$ and $y_{j+1}$, $j = 1, 2, \ldots, n - 1$, as pairs of corresponding $u$ and $v$ values, although, since each $u$ except the last appears also as a $v$, the pairs are clearly not independent.
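
The first method lends itself to a short numerical sketch. The code below takes observations at $x = 1, 2, \ldots, n$, estimates $m = e^p$ as the slope of a line fitted to the pairs $(y_{2j-1}, y_{2j})$, and then solves the two normal equations $na + b\sum m^x = \sum y$ and $a\sum m^x + b\sum m^{2x} = \sum ym^x$ for $a$ and $b$. Ordinary least squares of $v$ on $u$ is used for the line purely as an assumption of this sketch; the text notes that both variables are subject to error, so another line-fitting rule may be intended. The function name is illustrative.

```python
def villars_first_method(ys):
    """Approximate fit of y = a + b * m**x for x = 1..n (n even).

    Returns (a, b, m), with m an estimate of exp(p).
    """
    n = len(ys)
    if n < 4 or n % 2:
        raise ValueError("need an even number (at least 4) of observations")
    us = ys[0::2]  # y_1, y_3, y_5, ...
    vs = ys[1::2]  # y_2, y_4, y_6, ...
    k = len(us)
    u_bar = sum(us) / k
    v_bar = sum(vs) / k
    # Slope of v on u (ordinary least squares, assumed for this sketch).
    s_uu = sum((u - u_bar) ** 2 for u in us)
    s_uv = sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
    m = s_uv / s_uu
    # Normal equations for a and b with m held fixed.
    xs = range(1, n + 1)
    s1 = sum(m ** x for x in xs)
    s2 = sum(m ** (2 * x) for x in xs)
    sy = sum(ys)
    sym = sum(y * m ** x for x, y in zip(xs, ys))
    det = n * s2 - s1 * s1
    a = (sy * s2 - s1 * sym) / det
    b = (n * sym - s1 * sy) / det
    return a, b, m
```

For the alternative method one would instead pair `ys[:-1]` with `ys[1:]`, with the same caveat about the choice of line-fitting rule.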

A systematic method of calculating the exact least squares solution, starting with an approximate value of $p$ (or of $q = e^p$), has been presented by W. J. Spillman [7]. This method utilises tables of $q^x$ for $x$ between 2 and 20 and for $q$ at intervals of .01 between 0 and 1.

There is, of course, no point in fitting an exponential equation by least squares, or by any more or less equivalent method, unless there is some reason to believe that the underlying assumption of approximate uniformity in the variance of $y$ is valid. If $\log y$ is approximately normal with constant standard deviation, as seems to be true for many data in the field of economics, the usual procedure of fitting a straight line to the logarithms of the values of $y$ is clearly justified. On the other hand, if the standard deviation of $y$ is constant, the effect of this procedure is to give undue weight to the smaller values of $y$. Some data exist on fertilizer trials in which the assumption of constant standard deviation of $y$ seems reasonable, and which suggest an exponential, or modified exponential, trend.


9. Numerical illustrations. The data in Table V, referring to the mean girths $y$ of rubber trees in inches after various levels $x$ of fertilizer treatment, I owe to the courtesy of Mr. H. Fairfield Smith.


TABLE V 







It will be observed that the values of x are not equidistant. This makes the cal- 
culations more awkward. 

If the fitted equation is $Y = a + bq^x$, where $q = e^p$, the least squares equations for $a$, $b$, $q$ are


(9.1)    $5a + b[1 + q + q^3 + q^5 + q^7] = y_0 + y_1 + y_3 + y_5 + y_7$,

         $a[1 + q + q^3 + q^5 + q^7] + b[1 + q^2 + q^6 + q^{10} + q^{14}] = y_0 + y_1q + y_3q^3 + y_5q^5 + y_7q^7$,

         $a[1 + 3q^2 + 5q^4 + 7q^6] + b[q + 3q^5 + 5q^9 + 7q^{13}] = y_1 + 3y_3q^2 + 5y_5q^4 + 7y_7q^6$.


Approximate values for $a$, $b$, and $q$ were found by Cowden's method [6]. A curve was drawn by eye between the plotted points and a trial value of $a$ calculated from three equidistant ordinates, $Y_0$, $Y_1$, $Y_2$, by the formula


(9.2)    $a = \dfrac{Y_0Y_2 - Y_1^2}{Y_0 + Y_2 - 2Y_1}$.


Values of $Y - a$ were then plotted on semi-logarithmic paper, and the value of $a$ was adjusted by trial so that a straight line fitted the points reasonably well. From this straight line, values of $b$ and $q$ were obtained, $b$ being the ordinate at $x = 0$, and $q^7$ the ratio of the ordinates at $x = 7$ and $x = 0$. In this way the following approximate values were calculated:


$a_0 = 22.5$,    $b_0 = -2.0$,    $q_0 = 0.70$.
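
The trial value of the asymptote can be checked numerically. The sketch below assumes the three-ordinate formula is the standard one, $a = (Y_0Y_2 - Y_1^2)/(Y_0 + Y_2 - 2Y_1)$, which is exact whenever the three equidistant ordinates lie on a curve $a + bq^x$; the function name is illustrative.

```python
def asymptote_from_three_ordinates(y0: float, y1: float, y2: float) -> float:
    """Trial value of a in Y = a + b*q**x from three equidistant ordinates."""
    denom = y0 + y2 - 2.0 * y1
    if denom == 0.0:
        # Zero second difference: the ordinates are collinear.
        raise ValueError("ordinates are collinear; no finite asymptote")
    return (y0 * y2 - y1 * y1) / denom
```

With ordinates read from the curve $22.5 - 2.0\,(0.7)^x$ at $x = 0, 1, 2$ the formula returns 22.5, since numerator and denominator reduce to $ab(1-q)^2$ and $b(1-q)^2$ respectively.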


One application of Seidel's method (solving linear equations in $\delta a$, $\delta b$, $\delta q$) gave the improved values


$a = 22.47$,    $b = -1.945$,    $q = 0.7000$.


Using these values, the calculated $Y$ of Table V were obtained. The correlation coefficient between observed $y$ and computed $Y$ is 0.99735, which corresponds to $n = 5$. If, ignoring the slight deviation from uniformity in the $x$-intervals, we use Table IV, we find that $P$ is slightly greater than 0.001. The null hypothesis of no effect of the fertilizer is decisively rejected.

Villars [4] gave an illustration of the fitting of an exponential curve to data referring to a certain property of a rubber latex. By his first method he obtained as the regression equation


(9.3) Y = 1.0009 2 0.28776 0 


, 


where 2x = ¢ + 1 of his formula (4.1). By the second method he obtained the 
equivalent of 


(9.4) Y = 1.0000 — 0.28116°°"™, 


If the correlation coefficients between Y and y are computed for both equations, 
the values turn out to be 0.9560 and 0.9572, respectively, so that the second 







method gives a slightly better fit. The number of observations, however, is large 
enough (sixteen) for such a coefficient of correlation to be very highly significant, 
with reference to the null hypothesis. 

For the purposes of illustration, we will use only the first six of Villars’ ob- 
servations, given in the first two columns of Table VI. 


TABLE VI 





Y (method 1) Y (method 2) 


771i 0.7742 
855% 0.8415 
. 8826 0.8754 
.8914 0.8924 
. 8942 10 


0.90 
. 904 8951 0.9053 


By Villars’ first method the values of Y given in column 3 were calculated, 
corresponding to the equation Y = 0.8955 — 0.385le°""”*. The correlation 
coefficient between y and Y is R = 0.871, so that from Table IV the departure 
from the null hypothesis is barely significant. Villars’ second method gives the 
values in column 4, corresponding to Y = 0.9088 — 0.2693e°*"", and in this 
case R = 0.906. The fit is therefore appreciably better, and the regression ap- 
pears to be significant at a level about midway between .01 and .05. 


10. Acknowledgement. In conclusion, the author would like to express his 
great indebtedness to Professor Harold Hotelling for having suggested the 
problem and for helpful advice and criticism. 


REFERENCES 

[1] Harold Hotelling, "Tubes and spheres in n-spaces and a class of statistical problems," Am. Jour. of Math., Vol. 61 (1939), p. 440.

[2] D. M. Starkey, "The distribution of the multiple correlation coefficient in periodogram analysis," Annals of Math. Stat., Vol. 10 (1939), p. 327.

[3] J. W. Glover, Tables of Applied Mathematics, George Wahr, Ann Arbor, Mich., 1923, pp. 468-481.

[4] D. S. Villars, "A significance test and estimation in the case of exponential regression," Annals of Math. Stat., Vol. 18 (1947), p. 596.

[5] D. S. Villars and T. W. Anderson, "Significance tests for normal bivariate distributions," Annals of Math. Stat., Vol. 14 (1943), p. 141.

[6] D. J. Cowden, "Simplified methods of fitting certain types of growth curves," Jour. Am. Stat. Assn., Vol. 42 (1947), p. 585.

[7] W. J. Spillman, Use of the Exponential Yield Curve in Fertilizer Experiments (Technical Bulletin No. 348), U. S. Department of Agriculture, Washington, D. C.





ON THE DURATION OF RANDOM WALKS¹


By Wolfgang Wasow


Institute for Numerical Analysis 


Summary and introduction. In a recent paper [1] the author investigated the 
mean number of steps in random walks in n-dimensional domains. The purpose 
of the present article is to generalize those results by applying similar methods 
to the study of the moment generating function for the number of steps and of 
its distribution function. As an application explicit asymptotic expressions for 
the variance in special cases and estimates for the likelihood of very long walks 
are obtained. 

The author wishes to express his thanks to Professor R. Fortet for many 
helpful discussions. 

The walks take place in an open bounded domain $B$ of $n$-dimensional Euclidean space $E$ with boundary $C$. A point moves in $E$ according to a given transition probability law $F(y, x)$. Here $x$ and $y$ are points of $E$ with coordinates $x_i$, $i = 1, 2, \ldots, n$, and $y_i$, $i = 1, 2, \ldots, n$, and $F(y, x)$ is the probability that a jump known to start at $x$ ends at a point all of whose coordinates are less than the corresponding ones of $y$. The function $F(y, x)$ is a distribution function with respect to $y$, and it is assumed to be Borel measurable with respect to all variables. Let $N = N_x$ be the number of steps in a random walk that begins at a point $x$ of $B$ and ends with the step on which the moving point leaves $B$ for the first time. If the probability of the moving point eventually leaving $B$ is equal to one, then $N$ is a random variable. It is called the duration of the walk. It is useful to extend the definition of $N$ by setting


$N_x = 0$,    $x \in E - B$.


1. The moment generating function of the duration. The probability distribution of the duration, given by the functions


(1.1)    $p_k(x) = \Pr\{N_x = k\}$,    $k = 0, 1, 2, \ldots$,

satisfies the recursion relations


(1.2)    $p_{k+1}(x) = \begin{cases} \int_E p_k(y)\,dF(y, x), & x \in B, \\ 0, & x \in E - B, \end{cases}$
         $p_0(x) = \begin{cases} 0, & x \in B, \\ 1, & x \in E - B. \end{cases}$


Here and in the sequel all Stieltjes differentials are formed with respect to the first argument.

1 The preparation of this paper was sponsored in part by the Office of Naval Research. 


We need some hypothesis sufficient to ensure the relation $\sum_{k=0}^{\infty} p_k(x) = 1$ and the existence of the moment generating function of $N_x$. This is the purpose of

Assumption A. There exists a positive integer $m$ and a positive number $c < 1$, both independent of $x$, such that

$\Pr\{N_x > m\} \le c$

for all $x$ in $B$.

From this assumption, which is slightly more general than the corresponding condition in [1], the equality $\sum_{k=0}^{\infty} p_k(x) = 1$ follows by a simple argument similar to that in [2], pp. 431-432. In fact, the last inequality implies

$\Pr\{N_x > jm\} \le c^j$

and therefore

$\lim_{j\to\infty} \sum_{k=0}^{mj} p_k(x) = \lim_{j\to\infty} [1 - \Pr\{N_x > jm\}] \ge \lim_{j\to\infty} (1 - c^j) = 1$.
The aim of this section is the following theorem:

Theorem 1. If Assumption A is satisfied, then the moment generating function $\phi(s, x) = \sum_{j=0}^{\infty} e^{js}p_j(x)$ of the duration $N_x$ exists in a complex neighborhood of $s = 0$ and is the unique solution of the integral equation problem


(1.3)    $\phi(s, x) = \begin{cases} e^s \int_E \phi(s, y)\,dF(y, x), & x \in B, \\ 1, & x \in E - B. \end{cases}$

Because of later need we prove a slightly more general result.


Lemma 1. If Assumption A is satisfied and $f(x)$ is a real function such that $|f(x)| \le K$ in $E - B$, then the integral equation problem


(1.4)    $u(s, x) = \begin{cases} e^s \int_E u(s, y)\,dF(y, x), & x \in B, \\ f(x), & x \in E - B, \end{cases}$

possesses for $\mathrm{Re}\, s < s_0$ ($s_0 > 0$) a unique solution. This solution satisfies the inequality

(1.5)    $|u(s, x)| \le \phi(\mathrm{Re}\, s, x) \cdot K$,    $x \in B$,

where $\phi(s, x)$, the moment generating function of the duration, is the solution of (1.3).

Proof. Assume, at first, that $f(x)$ is nonnegative, and that $s$ is real. Set

(1.6)    $u_0(x) = \begin{cases} 0, & x \in B, \\ f(x), & x \in E - B, \end{cases}$
         $u_{k+1}(x) = \begin{cases} e^s \int_E u_k(y)\,dF(y, x), & x \in B, \\ f(x), & x \in E - B. \end{cases}$





Then

(1.7)    $u_k(x) = \sum_{j=1}^{k} e^{js} q_j(x)$,    $x \in B$,

where, for $x$ in $B$,

(1.8)    $q_1(x) = \int_{E-B} f(y)\,dF(y, x)$,    $q_{j+1}(x) = \int_B q_j(y)\,dF(y, x)$.


The $q_j(x)$ are nonnegative, and the $u_k(x)$ form therefore a nondecreasing sequence. Iterating (1.6) $m$ times we find, for $k > m$,


(1.9)    $u_{k+1}(x) = e^{ms} \int_B u_{k-m+1}(y)\,dF_m(y, x) + \chi_m(x)$,    $x \in B$,

where

$F_1(y, x) = F(y, x)$,    $F_m(y, x) = \int_B F_{m-1}(y, z)\,dF(z, x)$,

and the $\chi_m(x)$ are bounded and nonnegative. If $x$ is in $B$, then, by Assumption A,


$\int_B dF_m(y, x) = \Pr\{N_x > m\} \le c < 1$.

Let

$L_k = \mathrm{l.u.b.}_{x \in B}\, u_k(x)$,    $M = \mathrm{l.u.b.}_{x \in B}\, \chi_m(x)$.

($L_k$ and $M$ depend on $s$.) Then, by (1.9) and because the sequence is nondecreasing,

$L_{k+1} \le e^{ms} c\, L_{k+1} + M$.

Hence,

$L_{k+1} \le M/(1 - e^{ms}c)$,

and, therefore, the nondecreasing sequence $u_k(x)$ is bounded for all $s$ for which $e^{ms}c < 1$. Thus, it tends to a limit $u(x) = u(s, x)$, which satisfies (1.4) and can be written, by (1.7), in the form


(1.10)    $u(s, x) = \sum_{j=1}^{\infty} e^{js} q_j(x)$.


Since this is a power series in $e^s$ it converges for complex $s$ also, as long as $\mathrm{Re}\, s < s_0 = -\log c/m$. Furthermore, we see by comparison of (1.8) and (1.2) that for $f(x) \equiv 1$ we have $q_j(x) = p_j(x)$ and $u(s, x) = \phi(s, x)$.

To prove the uniqueness, it suffices to show that $f(x) \equiv 0$ implies $u(x) \equiv 0$. By iteration of (1.4) we find, for $f(x) \equiv 0$,


$u(x) = e^{ms} \int_B u(y)\,dF_m(y, x)$,


and hence


$\mathrm{l.u.b.}_{x \in B}\, |u(x)| \le |e^{ms}|\, c \cdot \mathrm{l.u.b.}_{y \in B}\, |u(y)|$.

For values of $s$ such that $|e^{ms}|\, c < 1$ this implies, indeed, that $u(x) \equiv 0$.


If $f(x)$ is not necessarily positive, then we consider the integral equation problems


$u^{(1)} = \begin{cases} e^s \int_E u^{(1)}\,dF & \text{in } B, \\ K & \text{in } E - B, \end{cases}$

$u^{(2)} = \begin{cases} e^s \int_E u^{(2)}\,dF & \text{in } B, \\ K - f(x) & \text{in } E - B, \end{cases}$


which do have unique solutions by what has been proved already. $u = u^{(1)} - u^{(2)}$ is therefore the unique solution of the original problem. We note that this argument also extends the validity of the formulas (1.8) and (1.10) to the case that $f(x)$ is not necessarily positive.

Finally, the inequality (1.5) follows easily from (1.8), (1.2), and (1.10), since


$|u(x)| \le \sum_{j=1}^{\infty} |e^{js}|\, |q_j(x)| \le K \sum_{j=1}^{\infty} e^{j\,\mathrm{Re}\, s}\, p_j(x) = K\phi(\mathrm{Re}\, s, x)$.

This completes the proof of Lemma 1. Theorem 1 implies that all moments $M_k(x)$ of $N_x$ exist.


Theorem 2. The $k$th moment $M_k(x)$ of the duration $N_x$ satisfies, for $k > 0$, the integral equation problem


(1.11)    $M_k(x) = \begin{cases} \int_E M_k(y)\,dF(y, x) + f_k(x), & x \in B, \\ 0, & x \in E - B, \end{cases}$

where


(1.12)    $f_k(x) = \sum_{j=1}^{k} (-1)^{j+1} \binom{k}{j} M_{k-j}(x)$.


Proof. From the definition

$M_k(x) = \sum_{j=0}^{\infty} j^k p_j(x)$

it follows by an application of (1.2) that, for $x$ in $B$,


(1.13)    $\int_E M_k(y)\,dF(y, x) = \sum_{j=1}^{\infty} (j - 1)^k p_j(x)$.







Expansion of the binomial expression in the second member followed by an interchange of summations proves the theorem.

These integral equations for the moments form an inductive sequence, since $f_k(x)$ depends only on the moments of order $j < k$. The equation for $M_1(x)$ was the main subject of [1].
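
The inductive structure can be illustrated on a simple lattice walk, which is an assumption of this sketch rather than a case from the paper: steps $\pm 1$ with equal probability and $B = \{1, \ldots, L-1\}$. The code assumes $f_k(x) = \sum_{j=1}^{k} (-1)^{j+1}\binom{k}{j} M_{k-j}(x)$ with $M_0 \equiv 1$, and solves each moment equation by Gauss-Seidel sweeps; the function name and sweep count are illustrative.

```python
from math import comb


def duration_moments(length: int, max_order: int, sweeps: int = 2000):
    """Return M with M[k][x] approximating E[N_x ** k], k = 0..max_order.

    Symmetric +-1 walk on the integers, B = {1, ..., length - 1}.
    """
    size = length + 1
    moments = [[1.0] * size]  # M_0 = 1: the p_k(x) sum to one
    for k in range(1, max_order + 1):
        # Inhomogeneous term f_k built from the lower-order moments.
        f = [sum((-1) ** (j + 1) * comb(k, j) * moments[k - j][x]
                 for j in range(1, k + 1))
             for x in range(size)]
        m = [0.0] * size  # M_k = 0 outside B for k > 0
        for _ in range(sweeps):
            for x in range(1, length):
                # Gauss-Seidel update of M_k = (transition average) + f_k.
                m[x] = 0.5 * m[x - 1] + 0.5 * m[x + 1] + f[x]
        moments.append(m)
    return moments
```

For `length = 4` the first moment at the midpoint is 4 and the second is 24, so the variance of the duration there is 8. The default sweep count is ample for small `length`; larger domains need more sweeps or a direct linear solve.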


2. An asymptotic differential equation for the moment generating function. 
In the important case that the transition probability $F(y, x)$ is strongly concentrated about $x$, the integral equation (1.3) will now be shown to be approximately equivalent to a differential equation. To do this it will be assumed, as in [1] and [2], that $F(y, x) = F(y, x, \mu)$ depends on a small positive parameter $\mu$ in such a way that the following three hypotheses are satisfied.

Assumption B. Denote by $a_i(x, \mu)$, $b_{ik}(x, \mu)$, $i, k = 1, 2, \ldots, n$, the first and second moments of $F(y, x, \mu)$ about $x$. Then


(2.1)    $a_i(x, \mu) = \alpha_i(x)\mu + o(\mu)$,
(2.2)    $b_{ik}(x, \mu) = \beta_{ik}(x)\mu + o(\mu)$.

These relations hold uniformly for $x$ in $B$. The $\alpha_i(x)$ and $\beta_{ik}(x)$ are twice continuously differentiable in $B + C$.

Assumption L. Let $K_r(x)$ denote the sphere of radius $r$ with center at $x$. Then


$\int_{E - K_r(x)} (y_i - x_i)(y_k - x_k)\,dF(y, x, \mu) = o(\mu)$,    $i, k = 1, 2, \ldots, n$,


uniformly with respect to $x$ in $B$, for any fixed $r > 0$.

These assumptions could very likely be weakened to the equivalent of the analogous hypotheses in [12].

Assumption E. The matrix $\{\beta_{ik}(x)\}$, which is obviously nonnegative, is positive definite in $B + C$.

For what follows we also require a certain degree of smoothness of the boundary.

Assumption S. The boundary $C$ has a continuously turning tangent plane. (This restriction could be considerably weakened; e.g., by inserting the word "piecewise", cf. [2], p. 438.)

We prove first 

Lemma 2. Assumptions B, L, and E imply Assumption A, at least for sufficiently small $\mu$.

Proof. To simplify the notation we give the proof first for the one-dimensional case, in which we can write $\alpha$, $\beta$, $x$, $y$ for $\alpha_i$, $\beta_{ik}$, $x_i$, $y_i$. Using Assumptions B and L we have, for any $\epsilon > 0$,


bu = [ wy — 2) arW,2,0) +0W) = [2 aP 


z z+e z 
+f w-oartow sel w-aar-ef Y-DaF+o0W) 







and also 


an = | (y — x) dF + [ (y — x) dF + ou). 
Multiplying the last equality by « and adding it to the preceding inequality, 
we find 


ezt+e 


(8 — €a)u + o(u) < 2 | (y — x) dF. 


Since $\beta(x) >$ const. $> 0$ in $B$, by Assumption E, we see, by first choosing $\epsilon$
sufficiently small, that for small $\mu$


/ (y— x) dF > Cu, zeB, 


where C is an (arbitrary) positive constant. 
On the other hand, for p’ < «, 


r+e -z+yu? pre = 
/ y-dr=[  Gy-aaFt+f[ w-narsi+ «| dF. 
z z zu? z+u2 
Combining this with the preceding inequality, we have 


Priy—2x>p} > Cu, ze B, 


where C; is another constant. Hence Assumption A is satisfied, if m is chosen 
greater than the diameter D of B divided by yu’, and c = 1 — (Cys)?**, 

In more than one dimension the same argument can be applied in any one 
of the coordinate directions. The changes necessary concern only the notation. 
Thus the lemma is proved. 

Denote by $\psi(s, x, \mu)$ the moment generating function of the random variable

$t = \mu N_x$.

Obviously

(2.3)    $\psi(s, x, \mu) = \phi(\mu s, x, \mu)$.


Let furthermore $L[u]$ be an abbreviation for the operator


(2.4)    $L[u] = \dfrac{1}{2} \sum_{i,k=1}^{n} \beta_{ik}(x) \dfrac{\partial^2 u}{\partial x_i\,\partial x_k} + \sum_{j=1}^{n} \alpha_j(x) \dfrac{\partial u}{\partial x_j}$.


Then the following theorem will be proved.

Theorem 3. If Assumptions B, L, E, and S are satisfied, then the moment generating function $\psi(s, x, \mu)$ of the random variable $t = \mu N$ satisfies the limit relation


(2.5)    $\lim_{\mu \to 0} \psi(s, x, \mu) = u(s, x)$







uniformly for $x$ in $B$, $|s| < s_0$, where $u(s, x)$ is the solution of the problem


(2.6)    $L[u] + su = 0$,    $x \in B$,
         $u = 1$,    $x \in E - B$.


Before we can prove the limit relation (2.5) we have to prove separately the weaker statement that $\psi(s, x, \mu)$ remains bounded as $\mu \to 0$.

Lemma 3. If Assumptions B, L, E, and S are satisfied, then there exist two positive constants $C$ and $C'$, independent of $\mu$, such that


(2.7)    $|\psi(s, x, \mu)| < C$    for $|s| < C'$.

Proof. From the results of [1], in particular from Lemma 3, Theorem 1, and Theorem 2 of that paper, it follows that the solution of the problem


(2.8)    $M(x) = \begin{cases} \int_E M(y)\,dF(y, x, \mu) + f(x), & x \in B, \\ 0, & x \in E - B, \end{cases}$

satisfies the inequality

(2.9)    $|M(x)| \le C_1\, \mathrm{l.u.b.}\, |f(x)| / \mu$,

where $C_1$ is a constant independent of $\mu$. This remains true if $f(x)$ depends boundedly on $\mu$. We wish to apply this inequality to the integral equations (1.11). To that end we first note that


(2.10)    $|f_k(x)| \le kM_{k-1}(x)$,

for $f_k(x)$ can be written, by (1.11) and (1.13), in the form


$f_k(x) = M_k(x) - \sum_{j=1}^{\infty} (j - 1)^k p_j(x) = \sum_{j=1}^{\infty} [j^k - (j - 1)^k]\, p_j(x) = k \sum_{j=1}^{\infty} (j^*)^{k-1} p_j(x)$,

where $j - 1 < j^* < j$, and, therefore,

$0 \le f_k(x) \le k \sum_{j=1}^{\infty} j^{k-1} p_j(x) = kM_{k-1}(x)$.


Applying (2.9) and (2.10) inductively to (1.11) we obtain

(2.11)    $|M_k(x)| \le k!\,(C_1/\mu)^k$.


Substitution of this inequality into the formula


$\psi(s, x, \mu) = \sum_{k=0}^{\infty} \dfrac{M_k(x)}{k!} (\mu s)^k$

yields

$|\psi(s, x, \mu)| \le \sum_{k=0}^{\infty} (C_1 |s|)^k = \dfrac{1}{1 - C_1 |s|}$


for $|s| < C_1^{-1}$. This proves Lemma 3.

Proof of Theorem 3. The basic idea of the proof is similar to that of Theorem 2 in [1] and thus to Petrowsky's reasoning in [2]; we show that $u(x)$ satisfies an integral equation little different from that for $\psi(s, x, \mu)$, and conclude from that fact that the two functions are nearly the same.

We first replace $u(x)$ by a slightly different function $u_\delta(x)$ defined in a larger domain $B''$, in order to avoid extraneous difficulties near the boundary $C$. This can be done by constructing a twice continuously differentiable mapping


$x_i' = f_i(x, \delta)$,    $x \in B$,    $i = 1, 2, \ldots, n$,


which is continuous in $\delta$, for $\delta \ge 0$, together with its first and second derivatives with respect to $x$, and has the following properties.

(a) It reduces to the identity for $\delta \to 0$.

(b) It is, for all $\delta$, the identity transformation in a subdomain $B'$ of $B$ that tends to $B$ as $\delta \to 0$.

(c) It maps $B$ onto a domain $B''$ containing $B$ in its interior for $\delta > 0$.

For the explicit construction of such a mapping with the help of Assumption S we refer to [2].

If we define $u_\delta(x)$ in $B''$ by


$u_\delta(x') = u(x)$,    $x \in B$,


then this new function is defined and twice continuously differentiable in $B''$. It tends to $u(x)$, uniformly in $B$, together with its first and second derivatives. We extend the definition into the whole space $E$ by setting


$u_\delta(x) = 1$,    $x \in E - B''$.


Next, it can be shown that, for any $\epsilon > 0$, we can choose first a $\delta > 0$ and then a $\mu_0 > 0$ such that, for $|s| < s_0$,


(2.12)    $\int_E u_\delta(y)\,dF(y, x) = (1 - \mu s)\,u_\delta(x) + \mu g(s, x, \mu)$,    $x \in B$,


where

(2.13)    $|g(s, x, \mu)| < \epsilon$,


provided $\mu < \mu_0$. The proof of this statement resembles so much the analogous arguments in [1] and [2] that it will be omitted here. (Formula (2.12) is essentially the result of expanding $u_\delta(y)$ about $x$ by Taylor's formula up to quadratic terms and applying Assumptions B and L.)





Because of the definition of u(x) it can also be assumed that 6 has been 
chosen so small that 
(2.14) lu —ul|<e, zeB, 
(2.15) luw—I1|<e, zekK — B. 
For yo sufficiently small, 
| (1 — ws — &™)us| < ue, zeB, 


Therefore we can write, instead of (2.12), 


ef" [ us(y) dF(y, 2) = w(x) + uh(s, 2, 2), 


with 
(2.16) | A(s, x, uw) | < de. 


° ° 1) (2) 
We now split us into the sum us = uj” + uj”, where 


=) (1) . ( . + 
(2.17) us? e | us dF in B, =u; in E— B, 
E 


2.18) uu? =e" [ ae as eo a toe BR 
Eg 


To estimate uj” we subtract it from (1.3) with su substituted for s, and use 


(2.15), (2.3), Lemma 1, and (2.7). This yields 

(2.19) ius? — y| < CKe, ze B. 
This implies, in particular, that uj” is bounded as » — 0. Therefore (2.18) can 
be written, for sufficiently small yo , in the form 


us = / us? dF + u(sus”? — h*) in B, ; =0 in E-B, 
E 


where 

|h* | = | h*(s, x, w) | < 4e. 
Application of (2.9) yields 

lu.b. | us” | < Cy(j s | Lub. | uj” | + 40), 
i.€., 
(2.20) | us? | < 4€C,/(1 — Ci | 8 |) 
for|s|< so < 1/C,. Combining (2.19) and (2.20) we obtain the inequality 
|u— wy] < const. e¢, re B, 


which proves Theorem 3. 







Theorem 4. If Assumptions B, L, E, and S are satisfied, then the $k$th moment $M_k(x, \mu)$ of the duration $N_x$ satisfies, uniformly in $B$, the relation


$\lim_{\mu \to 0} \mu^k M_k(x, \mu) = m_k(x)$,    $k \ge 1$,


where the $m_k(x)$ are defined recursively by the conditions

$m_0(x) = 1$,

and, for $k \ge 1$,

$L[m_k] + km_{k-1} = 0$,    $x \in B$,
$m_k = 0$,    $x$ on $C$.

Proof. The solution $u(x, s)$ of (2.6) is connected with the functions $m_k(x)$ by the relation

$u(s, x) = \sum_{j=0}^{\infty} \dfrac{m_j(x)}{j!}\, s^j$,

as can be seen by replacing $u(x, s)$ in (2.6) by its series in powers of $s$ and collecting coefficients of like powers. By Theorem 3 the function


$\psi(s, x, \mu) = \sum_{j=0}^{\infty} \dfrac{\mu^j M_j(x)}{j!}\, s^j$

tends uniformly to $u(x, s)$ as $\mu \to 0$; hence the coefficients of the first power series are the uniform limits of those of the second. This proves Theorem 4. For $k = 1$, Theorem 4 was proved in [1].
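
A one-dimensional worked case may help fix ideas. It is an illustration added here under stated assumptions, not an example from the paper: take $n = 1$, $B = (0, 1)$, $\alpha \equiv 0$ and $\beta \equiv 1$ (as would arise, for instance, from steps $\pm\sqrt{\mu}$ taken with equal probability), and assume the operator is $L[u] = \tfrac{1}{2}\beta u'' + \alpha u'$ with the recursion $L[m_k] + km_{k-1} = 0$, $m_0 = 1$.

```latex
% Problem (2.6) with L[u] = u''/2 on B = (0,1), u = 1 at x = 0 and x = 1:
\tfrac{1}{2}u'' + su = 0
\quad\Longrightarrow\quad
u(s,x) = \frac{\cos\!\bigl(\sqrt{2s}\,(x - \tfrac{1}{2})\bigr)}
              {\cos\!\bigl(\tfrac{1}{2}\sqrt{2s}\bigr)},
\qquad s < \tfrac{\pi^2}{2}.

% Recursion for the limiting moments, with m_k(0) = m_k(1) = 0:
\tfrac{1}{2}m_1'' + 1 = 0
\quad\Longrightarrow\quad m_1(x) = x(1-x),
\\[4pt]
\tfrac{1}{2}m_2'' + 2m_1 = 0
\quad\Longrightarrow\quad m_2(x) = \tfrac{1}{3}\bigl(x - 2x^3 + x^4\bigr).

% Consistency check at x = 1/2, using sec z = 1 + z^2/2 + 5z^4/24 + ...
% with z^2 = s/2:
u(s,\tfrac{1}{2}) = 1 + \tfrac{1}{4}s + \tfrac{5}{96}s^2 + \cdots
= m_0 + m_1(\tfrac{1}{2})\,s + \frac{m_2(\tfrac{1}{2})}{2!}\,s^2 + \cdots,
\qquad m_1(\tfrac{1}{2}) = \tfrac{1}{4},\quad m_2(\tfrac{1}{2}) = \tfrac{5}{48}.
```

In this case the limiting variance of $\mu N_x$ at the midpoint is $m_2 - m_1^2 = 5/48 - 1/16 = 1/24$.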


3. An asymptotic differential equation for the distribution function of the duration. Let


$P(t, x, \mu) = \Pr\{\mu N_x \le t\}$

be the distribution function of the random variable $\mu N$. From Theorem 3 we conclude by means of the continuity theorem for moment generating functions (see [3]) that there exists a distribution function $Q(t, x)$ of $t$ such that


(3.1)    $\lim_{\mu \to 0} P(t, x, \mu) = Q(t, x)$


at all continuity points of $Q(t, x)$ with respect to $t$, and that $u(s, x)$ is the moment generating function of $Q(t, x)$.

The probability


(3.2)    $P(t, x, \mu) = \sum_{k \le t/\mu} p_k(x)$


satisfies, because of (1.2), the recursive relations


(3.3)    $P(t + \mu, x, \mu) = \int_E P(t, y, \mu)\,dF(y, x, \mu)$,    $x \in B$,    $t = 0, \mu, 2\mu, \ldots$,
         $P(0, x, \mu) = 0$,    $x \in B$,
         $P(t, x, \mu) = 1$,    $x \in E - B$.







From these and (3.1) it is easy to obtain, in a purely formal way, the differential equation (3.5) for $Q(t, x)$. The same result can be made plausible by setting


(3.4)    $u(s, x) = \int_0^{\infty} e^{st}\,dQ(t, x)$


in (2.6) and operating formally on the Stieltjes differential. Our aim in this section is to give a proof of (3.5). In spite of the plausibility of the result the proof is somewhat long, because the problem combines the features of what Khintchine calls, in [4], the first and second diffusion problems.

A feasible approach to our problem, different from that of this paper, could be based on the remark that $u(s, x)$ and therefore $Q(t, x)$ depend only on the functions $\alpha_i(x)$, $\beta_{ik}(x)$, so that $F(y, x, \mu)$ can be chosen in many different ways as far as the properties of $Q(t, x)$ are concerned. (This is an instance of the "invariance principle", used systematically, in a different context, by Kac and Erdős, cf. [5].) The most natural choice for $F(y, x, \mu)$ is the one obtained from the continuous Markov process associated with the differential equation (3.5) by considering that process at discrete time values $t = 0, \mu, 2\mu, \ldots$, only. This approach has not been chosen here, first, because we wish to preserve uniformity of method, and secondly, because the theory of such Markov processes does not seem to have been established in sufficient completeness for the $n$-dimensional case. (In one dimension, the proof that such a continuous process exists was given by Feller in [12]. This has been partially generalized to $n$ dimensions by Dressel [6]. The proof that the duration of this continuous process satisfies the differential equation of (3.5) in the one-dimensional case is contained in the article [13] by Fortet.)

Theorem 5. If the assumptions B, L, E, and S are satisfied, then


$\lim_{\mu \to 0} \Pr\{\mu N_x \le t\} = Q(t, x)$


exists and is the solution of the differential equation problem


(3.5)    $L[Q] - \dfrac{\partial Q}{\partial t} = 0$,    $t > 0$,    $x \in B$,
         $Q = 1$,    $t > 0$,    $x$ on $C$,
         $Q = 0$,    $t = 0$,    $x \in B$.


The convergence is uniform in $B + C$, for any interval $0 < t_0 \le t \le t_1$.
The proof of this theorem is based on two lemmas.
Lemma 4. The distribution function $Q(t, x)$ is a continuous function of all arguments combined, for $t > 0$, $x \in B + C$, and for $t = 0$, $x \in B$.

Proof. Let

(3.6)    $g(t, x, T) = \dfrac{1}{2T} \int_{-T}^{T} u(is, x)\,e^{-ist}\,ds$.







It is known that this function is real and tends to a limit as $T \to \infty$. For fixed $x$ the distribution $Q(t, x)$ is continuous in $t$ if and only if


(3.7)    $\lim_{T \to \infty} g(t, x, T) = 0$.

(See, e.g., [7], p. 24, for these statements.) Also,

(3.8)    $|g(t, x, T)| \le 1$,


since $u(is, x)$, as a characteristic function, does not exceed one in absolute value. By (2.6) the function $g(t, x, T)$ satisfies for all $T$ the differential equation


(3.9)    $L[g] - \dfrac{\partial g}{\partial t} = 0$,    $x \in B$,    $-\infty < t < \infty$.

From (3.8) and (3.9) we can conclude (cf. [8], pp. 383-384) that $\partial g/\partial t$ is uniformly bounded for all $T$ in any finite $t$-interval and for $x$ in any closed subdomain of $B$. Therefore the limit of $g$, as $T \to \infty$, is a continuous function of $t$. On the other hand, since $Q(t, x)$ as a distribution function has at most a denumerable set of discontinuities, $\lim_{T \to \infty} g$ is zero for fixed $x$, except possibly at a denumerable set of $t$-values. Being continuous, the limit of $g$ must therefore be zero everywhere in the domain considered, i.e., $Q(t, x)$ is for all $x$ in $B$ and for all $t$ a continuous function of $t$. (The result of Gevrey [8], referred to above, is proved in that paper only for differential equations whose second order terms form Laplace's operator. A generalization sufficient for our needs can be established by combining Gevrey's arguments with the results of [6].) In $E - B$ the distribution functions $P(t, x, \mu)$, and therefore $Q(t, x)$, their limit as $\mu \to 0$, are identically 1 for $t > 0$. Hence $Q(t, x)$ is, for $t > 0$, a continuous function of $t$ in the closed domain $B + C$.

To prove that $Q(t, x)$ is continuous in $x$ also, it suffices to remember that $u(is, x)$, its characteristic function, is a continuous function of $x$ at $s = 0$. By the continuity theorem for characteristic functions ([7], p. 30) the corresponding distribution function $Q(t, x)$ is therefore continuous in $x$ for all $t > 0$. The continuity is uniform with respect to $x$ in every continuity interval of $t$ ([7], p. 31) and therefore $Q(t, x)$ is continuous in $t$ and $x$ combined for $t > 0$ and $x$ in $B + C$, as well as for $t = 0$, $x \in B$. This proves the lemma.

Corollary. The convergence of $P(t, x, \mu)$ to $Q(t, x)$, as $\mu \to 0$, is uniform for $x$ in $B + C$ and $0 < t_0 \le t \le t_1$. For, by a similar argument to that used in the preceding paragraph, it is seen that $P(t, x, \mu)$ is a continuous function of all arguments combined at $\mu = 0$, and this implies uniform continuity in the designated domain. This proves the last sentence of Theorem 5.

LEMMA 5. Let $u_k(x, \mu)$ satisfy the recursive relations

$$\begin{aligned} u_{k+1}(x, \mu) &= \int_E u_k(y, \mu)\, dF(y, x, \mu) + \mu a_k(x, \mu) && \text{in } B, \\ u_k(x, \mu) &= b_k(x, \mu) && \text{in } E - B, \end{aligned} \tag{3.10}$$

and let $|a_k|$, $|b_k|$, and $|u_0|$ be less than a constant $M$. Then, if Assumptions B,
L, E, and S are satisfied, the inequality

$$|u_k(x, \mu)| < C \cdot M$$

holds, where $C$ is a constant independent of $M$.

PROOF. Assume, at first, that $b_k$ and $u_0$ are identically zero, and that $a_k = 1/\mu$.
We denote the solution for this special case by $u_k^*$. It was proved in [1],
Lemma 2, that $u_k^*(x)$ tends monotonically to the first moment $M_1(x)$ of $N_x$,
as $k \to \infty$. From Theorem 4 we know that $\lim_{\mu \to 0} \mu M_1(x) = m_1(x)$, uniformly in
$B$. Hence, $0 \le u_k^* \le C/\mu$.

Next, we drop the assumption $a_k = 1/\mu$ and call the solution of the integral
relation (3.10), in that case, $u_k^{(1)}$. Then the function $u_k^{**} = \mu M u_k^* - u_k^{(1)}$ solves
the problem

$$u_{k+1}^{**} = \int_E u_k^{**}\, dF + \mu (M - a_k) \quad \text{in } B,$$

$$u_k^{**} = 0 \quad \text{in } E - B, \qquad u_0^{**} = 0 \quad \text{in } B.$$

Since $M - a_k > 0$, it follows that $u_k^{**} \ge 0$ in $B$ for all $k$, i.e., $u_k^{(1)} \le \text{const.}\, M$.
The inequality $-u_k^{(1)} \le \text{const.}\, M$ is proved analogously. Thus the lemma is
proved in this special case.

Now we take the solution $u_k^{(2)}(x, \mu)$ of the special case that $a_k = u_0 = 0$.
Here we obtain immediately by recursion the inequality $|u_k^{(2)}(x, \mu)| < M$.
The solution $u_k^{(3)}(x, \mu)$ of the special case $a_k = b_k = 0$ also satisfies trivially the
inequality $|u_k^{(3)}(x, \mu)| < M$.

Since the solution in the general case is the sum of three solutions corresponding
to the three special cases, the lemma is proved.

PROOF OF THEOREM 5. Instead of comparing $P(t, x, \mu)$ directly with the
solution of (3.5) we introduce the solution $v$ of the problem

$$L[v] - \frac{\partial v}{\partial t} = 0, \qquad t > 0, \quad x \in B, \tag{3.11}$$

$$v(0, x) = Q(t_0, x), \qquad x \in B, \tag{3.12}$$

$$v(t, x) = 1, \qquad x \in E - B. \tag{3.13}$$

By this device we avoid difficulties connected with the discontinuity in the
boundary conditions in (3.5) at $t = 0$, $x \in C$.
As in the proof of Theorem 3 we replace $v(t, x)$ by the function
$v_\delta(t, x)$ defined by

$$v_\delta(t', x') = v(t, x), \qquad x \in B', \quad t > -\delta,$$

$$v_\delta(t, x) = 0, \qquad x \in E - B,$$

where $x'$ is the function of $x$ and $\delta$ introduced in Section 2, and


(t for t> 6, 
(3.14) = 


t=— Gan fr OS tsk 







This function $v_\delta(t, x)$ possesses continuous second derivatives with respect to
the $x_i$ and $t$ for $x \in B''$ and $t > -\delta'$. It is therefore possible, as in Section 2 and
in [2], to apply Taylor's formula with quadratic terms to $v_\delta(t, y)$. An application
of Assumptions B and L yields, similarly as in [1] and in the proof
of Theorem 3,

$$\int_E v_\delta(t, y)\, dF(y, x, \mu) = v_\delta(t, x) + \mu L[v_\delta] + \mu g_1(t, x, \mu, \delta), \qquad x \in B, \tag{3.15}$$

where the function $g_1$ has the property that for every $\epsilon > 0$, $\delta > 0$, $t_1 > 0$, a
$\mu_0 > 0$ can be found such that

$$|g_1(t, x, \mu, \delta)| \le \epsilon, \qquad x \in B + C, \quad 0 < t < t_1, \quad \mu < \mu_0. \tag{3.16}$$

Now by the definition of $v_\delta$ it is possible to choose $\delta$ so small, independently
of the value of $\mu$, that

$$L[v_\delta] = \frac{\partial v_\delta}{\partial t} + g_2(t, x, \delta),$$

where $g_2(t, x, \delta)$ satisfies the same inequality as $g_1(t, x, \mu, \delta)$. Hence

$$\mu L[v_\delta] = v_\delta(t + \mu, x) - v_\delta(t, x) + \mu g_3(t, x, \mu, \delta), \tag{3.17}$$

where, for a certain positive $\mu_1 < \mu_0$, depending on $\delta$ and $\epsilon$,

$$|g_3(t, x, \mu, \delta)| < 2\epsilon, \qquad x \in B + C, \quad 0 < t < t_1, \quad \mu \le \mu_1. \tag{3.18}$$

Combining (3.17) and (3.15) we find

$$v_\delta(t + \mu, x) = \int_E v_\delta(t, y)\, dF(y, x, \mu) + \mu h(t, x, \mu, \delta), \tag{3.19}$$

where

$$|h(t, x, \mu, \delta)| < 3\epsilon, \qquad x \in B + C, \quad 0 < t < t_1. \tag{3.20}$$

Subtraction of (3.19) from (3.3) yields for

$$w(t, x, \mu) = P(t + t_0, x, \mu) - v_\delta(t, x) \tag{3.21}$$

the integral equation problem

$$w(t + \mu, x, \mu) = \int_E w(t, y, \mu)\, dF(y, x, \mu) + \mu h(t, x, \mu, \delta), \qquad x \in B, \tag{3.22}$$

$$w(t, x, \mu) = w_1(t, x, \mu), \qquad x \in E - B, \tag{3.23}$$

$$w(0, x, \mu) = w_0(x, \mu). \tag{3.24}$$

Here

$$|w_1(t, x, \mu)| = |P(t + t_0, x, \mu) - v_\delta(t, x)| = |1 - v_\delta(t, x)| < \epsilon \tag{3.25}$$







for $x \in E - B$, $0 < t < t_1$, provided $\delta$ has been chosen sufficiently small
(independently of $\mu$). This follows from (3.13) and the continuity properties of
$v_\delta(t, x)$. Similarly we have, if $\delta$ and $\mu_1$ are, independently of each other, chosen
sufficiently small,

$$|w_0(x, \mu)| = |P(t_0, x, \mu) - v_\delta(0, x)| \le |P(t_0, x, \mu) - Q(t_0, x)| + |v(0, x) - v_\delta(0, x)| < \epsilon, \qquad x \in B + C. \tag{3.26}$$

If we set $t = k\mu$ and write

$$w(k\mu, x, \mu) = u_k(x, \mu),$$

we can apply Lemma 5 to formulas (3.21) to (3.26) with the result that, if $\delta$ is
sufficiently small,

$$|P(t + t_0, x, \mu) - v_\delta(t, x)| < 4C\epsilon,$$

for $x \in B + C$, $0 < t < t_1$, and $\mu < \mu_1$.

Finally, since $v_\delta$ differs arbitrarily little from $v$ for sufficiently small $\delta$, and $\epsilon$
was arbitrary, it follows that $Q(t, x) = \lim_{\mu \to 0} P(t, x, \mu)$ is, for all $x \in B$, the
solution of the differential equation problem (3.11) to (3.13). By Lemma 4, $Q(t, x)$
approaches its values on $C$ and its initial values for $t = 0$, as $x$ approaches $C$
or $t \to 0$, respectively, and is, therefore, indeed the solution of problem (3.5).


4. Some applications. If $L[u]$ is self-adjoint, then the solution of (2.6) can be
calculated in the usual way by expansion in terms of the orthonormal eigenfunctions
$u_j(x)$ of $L[u] + \lambda u = 0$, corresponding to the eigenvalues $\lambda = \lambda_j$,
which are all real and positive. To do this we set $u = w + 1$ in (2.6) and solve
the resulting problem

$$L[w] + sw = -s \ \text{ in } B, \qquad w = 0 \ \text{ on } C,$$

by the standard methods. (Cf., e.g., [9], p. 312. The argument for ordinary
differential equations given there can be extended to partial differential equations
whenever the existence of Green's function is known.) We find

$$u(s, x) = 1 + w(s, x) = 1 + \sum_{j=1}^{\infty} \frac{s}{\lambda_j - s} \int_B u_j(y)\, dy \cdot u_j(x) = 1 + \sum_{j=1}^{\infty} \left( \frac{\lambda_j}{\lambda_j - s} - 1 \right) \int_B u_j(y)\, dy \cdot u_j(x). \tag{4.1}$$

The series $\sum_{j=1}^{\infty} \int_B u_j(y)\, dy \cdot u_j(x)$ is the generalized Fourier series of the function
that is identically one in $B$. If this series actually converges to 1 in the interior
of $B$, formula (4.1) simplifies to

$$u(s, x) = \sum_{j=1}^{\infty} \frac{\lambda_j}{\lambda_j - s} \int_B u_j(y)\, dy \cdot u_j(x), \qquad x \in B. \tag{4.2}$$







From here on we assume explicitly that (4.2) is valid:

ASSUMPTION C. The series $\sum_{j=1}^{\infty} \int_B u_j(y)\, dy \cdot u_j(x)$ converges for all $x \in B$.

In this case we can give an explicit expression for the distribution function
$Q(t, x)$, for

$$u(s, x) = \int_0^{\infty} e^{st} \sum_{j=1}^{\infty} \lambda_j \int_B u_j(y)\, dy \cdot u_j(x)\, e^{-\lambda_j t}\, dt,$$

and therefore, because of (3.4),

$$Q(t, x) = 1 - \sum_{j=1}^{\infty} \int_B u_j(y)\, dy \cdot u_j(x)\, e^{-\lambda_j t}. \tag{4.3}$$

This proves

THEOREM 6. If the Assumptions B, L, E, S, and C are satisfied, if $L[u]$ is
self-adjoint, and if the lowest eigenvalue $\lambda_1$ of $L[u] + \lambda u = 0$ is simple, then

$$\Pr\{N_x > k\} = u_1(x) \int_B u_1(y)\, dy \cdot e^{-\lambda_1 \mu k} + O(e^{-\lambda_2 \mu k}) + \alpha(k, x, \mu), \tag{4.4}$$

where

$$\lim_{\mu \to 0} \alpha(t, x, \mu) = 0$$

uniformly in $t$ and $x$.

The leading term in (4.4) is thus a good approximation to $\Pr\{N_x > k\}$ in a
range of the variables $\mu$, $k$ for which $k\mu$ is so large, and at the same time $\mu$ so
small, that the two remainder terms can be neglected.

The preceding calculations have some points of contact with those of M. 
Kac in [10], Section 10. The results there refer to the special case of Brownian 
motion. Also, an integral equation is used instead of (2.6), which permits a 
considerable relaxation of the condition S. 

As a special application we consider random walks for which $L[u]$ reduces to a
constant multiple of the Laplacian. It can then be assumed without loss of
generality (see [1], Section 4) that $\mu$ is the mean square of the step length and
that

$$L[u] = \frac{1}{2n} \Delta u. \tag{4.5}$$

A domain $B$ for which all quantities involved can be calculated explicitly is the
n-dimensional sphere of radius $a$ with center at $x = 0$. A routine calculation
leads to the formula

$$u(s, x) = \left( \frac{r}{a} \right)^{1 - n/2} \frac{J_{n/2-1}(\sqrt{2ns}\, r)}{J_{n/2-1}(\sqrt{2ns}\, a)} \tag{4.6}$$

for the moment generating function. Here $r = +\left( \sum_i x_i^2 \right)^{1/2}$ and $J_\nu(z)$ is Bessel's
function of the first kind. The series for $u(s, x)$ in powers of $s$,

$$u(s, x) = 1 + (a^2 - r^2)\, s + \frac{a^2 - r^2}{2(n + 2)} \left[ (n + 4) a^2 - n r^2 \right] s^2 + \cdots,$$

and an application of Theorems 3 and 4 lead to the expressions

$$\lim_{\mu \to 0} \mu E[N_x] = a^2 - r^2,$$

$$\lim_{\mu \to 0} \mu^2 E\left[ (N_x - E[N_x])^2 \right] = \frac{2}{n + 2} (a^4 - r^4)$$

for the mean and the variance of the duration. The relative error $e[N_x]$, i.e.,
the standard deviation of $N_x$ divided by its mean value, satisfies therefore the
relation

$$\lim_{\mu \to 0} e[N_x] = \sqrt{\frac{2}{n + 2}} \sqrt{\frac{a^2 + r^2}{a^2 - r^2}}. \tag{4.7}$$

It should be noted that the relative error is a decreasing function of the number of
dimensions.
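The step from the power series of the moment generating function to (4.7) can be checked numerically. The following is a minimal sketch, assuming only the series coefficients quoted in the text; the function names are hypothetical and chosen for illustration. The coefficient of $s$ is taken as the limiting mean and twice the coefficient of $s^2$ as the limiting second moment.

```python
import math

def duration_moments(n, a, r):
    """Limiting mean and variance of the scaled duration for the n-sphere
    of radius a, read off from the power series of u(s, x)."""
    c1 = a**2 - r**2                                            # coefficient of s
    c2 = (a**2 - r**2) * ((n + 4) * a**2 - n * r**2) / (2 * (n + 2))  # coefficient of s**2
    mean = c1
    variance = 2 * c2 - c1**2       # second moment minus squared mean
    return mean, variance

def relative_error(n, a, r):
    """Standard deviation divided by the mean, computed from the series."""
    mean, variance = duration_moments(n, a, r)
    return math.sqrt(variance) / mean

def relative_error_closed_form(n, a, r):
    """Right-hand side of (4.7)."""
    return math.sqrt(2 / (n + 2)) * math.sqrt((a**2 + r**2) / (a**2 - r**2))

for n in (1, 2, 3, 10):
    print(n, relative_error(n, 1.0, 0.5), relative_error_closed_form(n, 1.0, 0.5))
```

The two columns agree for every $n$, and the values shrink as $n$ grows, which is the monotonicity in the number of dimensions noted in the text.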

We omit the straightforward calculation needed for the determination of the
eigenfunctions $u_j(x)$ and eigenvalues $\lambda_j$ in the present case and state only the
results:

Let $\rho = \rho_j$ be the jth positive zero, in order of increasing size, of the function
$J_{n/2-1}(\rho)$; then

$$\lambda_j = \rho_j^2 / 2na^2.$$

Assumption C is satisfied (cf., e.g., [11], p. 591) and

$$Q(t, x) = 1 - 2 \left( \frac{r}{a} \right)^{1 - n/2} \sum_{j=1}^{\infty} \frac{J_{n/2-1}(\rho_j r/a)}{\rho_j J_{n/2}(\rho_j)}\, e^{-\rho_j^2 t / 2na^2}. \tag{4.8}$$

For $n = 2$, we find, e.g., using the approximation (4.4), for small $\mu$, and Theorem
4 with $k = 1$,

$$\Pr\{N_0 > 2E(N_0)\} \approx 1/11. \tag{4.9}$$
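The value in (4.9) can be reproduced from the leading term of (4.8). The sketch below is an illustration under stated assumptions, not the author's computation: it takes $n = 2$, $x = 0$, and $t = 2a^2$ (twice the limiting mean duration at the center), sums the Bessel functions from their power series, and locates the first zero of $J_0$ by bisection. All function names are hypothetical.

```python
import math

def bessel_j(order, z, terms=60):
    """Bessel function of the first kind for integer order >= 0,
    summed from its power series (adequate for moderate |z|)."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.factorial(m + order))
                  * (z / 2) ** (2 * m + order))
    return total

def first_zero_of_j0(lo=2.0, hi=3.0, iterations=80):
    """First positive zero of J_0, by bisection on a bracketing interval."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bessel_j(0, lo) * bessel_j(0, mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def leading_tail_term(multiple=2.0):
    """Leading term of 1 - Q(t, 0) in (4.8) for n = 2, x = 0, with
    t = multiple * a**2; the radius a cancels out of the exponent."""
    rho1 = first_zero_of_j0()
    coefficient = 2.0 / (rho1 * bessel_j(1, rho1))
    return coefficient * math.exp(-rho1 ** 2 * multiple / 4.0)

print(leading_tail_term())   # close to 1/11
```

Changing `multiple` gives the corresponding leading-term approximation for other multiples of the mean duration; the approximation is only meaningful where the remainder terms of (4.4) are negligible.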
REFERENCES
[1] W. Wasow, “On the mean duration of random walks,” Jour. Research Nat.
Bur. Standards, Vol. 46 (1951).
[2] I. Petrowsky, “Ueber das Irrfahrtproblem,” Math. Ann., Vol. 109 (1934), pp. 425-444.
[3] J. H. Curtiss, “On the theory of moment generating functions,” Annals of Math.
Stat., Vol. 13 (1942), pp. 430-433.
[4] A. Khintchine, Asymptotische Gesetze der Wahrscheinlichkeitsrechnung, Julius
Springer, Berlin, 1933.
[5] P. Erdös and M. Kac, “On the number of positive sums of independent random
variables,” Bull. Am. Math. Soc., Vol. 53 (1947), pp. 1011-1020.
[6] F. G. Dressel, “The fundamental solution of the parabolic equation II,” Duke Math.
Jour., Vol. 13 (1946), pp. 61-70.
[7] H. Cramér, Random Variables and Probability Distributions, Cambridge University
Press, 1937.
[8] M. Gevrey, “Sur les équations aux dérivées partielles du type parabolique,” Jour.
Math. Pures Appl., Vol. 9 (1913), pp. 305-471.
[9] R. Courant and D. Hilbert, Methoden der Mathematischen Physik, Vol. 1, 2nd ed.,
Julius Springer, Berlin, 1931.
[10] M. Kac, “On some connections between probability theory and differential and
integral equations,” to be published.
[11] G. N. Watson, Theory of Bessel Functions, 2nd ed., Cambridge University Press,
1948.
[12] W. Feller, “Zur Theorie der stochastischen Prozesse,” Math. Ann., Vol. 113 (1937),
pp. 113-161.
[13] R. Fortet, “Les fonctions aléatoires du type de Markoff associées à certaines équations
linéaires aux dérivées partielles du type parabolique,” Jour. Math. Pures Appl.,
9th series, Vol. 22 (1943), pp. 177-244.





ON THE THEORY OF UNBIASED TESTS OF SIMPLE STATISTICAL
HYPOTHESES SPECIFYING THE VALUES OF TWO OR
MORE PARAMETERS¹


By STANLEY L. ISAACSON


Columbia University and Iowa State College 


1. Summary. Unbiased critical regions of type D for testing simple hypotheses 
specifying the values of several parameters are defined and their properties 
studied. These regions constitute a natural generalization of the Neyman- 
Pearson regions of type A for testing simple hypotheses specifying the value of 
one parameter. A theorem is obtained which plays the role of the Neyman- 
Pearson fundamental lemma in the type A case. Illustrative examples of type 
D regions are given. 


2. Introduction. The parameter space $\Omega$ will, in this paper, be a subset of a
k-dimensional Euclidean space ($k > 1$), and $\theta = (\theta_1, \ldots, \theta_k)$ will denote a
point in $\Omega$. A simple statistical hypothesis is one which specifies the values of
all unknown parameters. When we refer to a statistical test we mean a Borel
measurable set in an n-dimensional sample space such that if the sample point
falls in this critical region we reject the null hypothesis. In this paper the term
region will always mean a Borel measurable set. The probability of rejecting a
true hypothesis when using a given test is called the size of this test. A test is
unbiased if the power function of the test has a relative minimum for the value
$\theta = \theta^0$, where $\theta^0$ is the value of $\theta$ specified by the hypothesis to be tested.

A locally best unbiased region for testing a simple hypothesis specifying the
value of one parameter is called type A by Neyman and Pearson [1]. It is obtained
by maximizing the curvature of the power curve at the point $\theta = \theta^0$
specified by the hypothesis, subject to the conditions of given size and unbiasedness.
Geometrically speaking, the power curve of a region of type A is
above the power curves of all other unbiased regions of the same size in an
infinitesimal neighborhood of $\theta^0$. For the purpose of generalization to the
k-parameter case it is useful to note that if we consider the power curve of the
type A region and the power curves of any other unbiased regions of the same 
size, then the length of a horizontal chord at a fixed infinitesimal distance above 
the minimum point is a minimum when compared with the length of this chord 
on the power curves of the other unbiased regions of the same size. We note 
that the definition of type A regions does not use any information about the 
relative importance of errors of type II. (An error of type II is made when we 
accept a false hypothesis.) 

Type A regions remain invariant under transformations of the parameter 
which are locally one-to-one and twice differentiable. Regions of type A can 


1 Work done under the sponsorship of the Office of Naval Research. 




be proved to exist under quite weak assumptions on the joint density function 
of the sample. 

When we proceed to consider simple hypotheses specifying the values of 
two or more parameters, we are immediately faced with a more complicated 
situation. (For the sake of simplicity in statement, we confine ourselves in this 
introductory section to the two-parameter theory; the extension of our dis- 
cussion to three or more parameters is direct.) In the two-parameter case the 
geometrical picture of the power function is a surface, and if we require of a 
locally best unbiased region that its power surface have maximum curvature 
along every cross-section at the point $(\theta_1, \theta_2) = (\theta_1^0, \theta_2^0)$ specified by the
hypothesis, subject to the conditions of size and unbiasedness, then it develops
that this requirement cannot be met even in the simplest cases; for if we maxi- 
mize the curvature of the power surface along one of its cross-sections, we find 
that in general this causes the curvature to diminish along other cross-sections 
and so we cannot maximize the curvature along all cross-sections at once. 

To handle the two-parameter theory, Neyman and Pearson [2] considered 
type C regions. They require of a critical region not only that it be of given 
size and unbiased but also that it have constant power in an infinitesimal
neighborhood of $(\theta_1^0, \theta_2^0)$ along a given family of concentric ellipses with the same
shape and orientation; the type C region is then defined as the one among this
class of regions which gives best local power. When the given family of ellipses
consists of circles, the region of type C is called regular; otherwise it is called
nonregular. One can choose the family of ellipses if and essentially only if one
knows the relative importance of errors of type II in an infinitesimal
neighborhood of $(\theta_1^0, \theta_2^0)$. In the absence of such information one cannot proceed to find
a region of type C. Regions of type C retain their property of unbiasedness under 
transformations of the parameter space which are locally one-to-one and twice 
differentiable, but in general regular unbiased critical regions of type C become 
nonregular under such transformations. Hence if one is inclined in the absence 
of advance information about errors of type II to favor the regular unbiased 
region of type C as a region fulfilling “good” intuitive requirements, then the 
objection can be raised that these regular regions of type C are not invariant 
under transformations of the parameter space. 

There is an approach to the problem of finding a “good” critical region which
overcomes the objections raised to the type C theory; i.e., it will provide us
with a criterion for choosing a critical region without using any advance
knowledge as to the relative importance of errors of type II, and this type of critical
region will be invariant under transformations of the parameters. This type of
critical region, which will be a natural generalization of the type A region of the
one-parameter theory, will maximize the Gaussian curvature of the power
surface at $(\theta_1, \theta_2) = (\theta_1^0, \theta_2^0)$, subject to the conditions of size and unbiasedness.
In the next two sections we shall develop this theory for simple hypotheses 
specifying the values of two parameters, and then in Section 5 we shall extend 
it to the case of simple hypotheses specifying the values of three or 
more parameters. 







3. Definition of unbiased critical regions of type D in the two-parameter case.
We introduce for the power function of a region $w$ the symbol

$$\beta(\theta_1, \theta_2 \mid w) = \Pr(E \in w \mid \theta_1, \theta_2), \tag{3.1}$$

where $E = (x_1, \ldots, x_n)$ is the sample point in an n-dimensional sample space.
Here the joint probability distribution of the sample depends on the parameter
$\theta = (\theta_1, \theta_2)$, and we are testing the hypothesis $(\theta_1, \theta_2) = (\theta_1^0, \theta_2^0)$. We
make a translation of the parameter space to bring the point $(\theta_1^0, \theta_2^0)$ to the
origin, so that we may consider the test of the hypothesis $(\theta_1, \theta_2) = (0, 0)$. The
size of the critical region is then

$$\beta(0, 0 \mid w) = \Pr(E \in w \mid 0, 0). \tag{3.2}$$

We also introduce the following notation:

$$\beta_i(w) = \left. \frac{\partial \beta(\theta_1, \theta_2 \mid w)}{\partial \theta_i} \right|_{(\theta_1, \theta_2) = (0, 0)}, \tag{3.3}$$

$$\beta_{ij}(w) = \left. \frac{\partial^2 \beta(\theta_1, \theta_2 \mid w)}{\partial \theta_i\, \partial \theta_j} \right|_{(\theta_1, \theta_2) = (0, 0)}. \tag{3.4}$$

We assume these derivatives exist. We shall write $\beta_i$ and $\beta_{ij}$ for $\beta_i(w)$ and $\beta_{ij}(w)$,
respectively, whenever our doing so will cause no ambiguity. We note that the
derivatives are taken at $(\theta_1, \theta_2) = (0, 0)$, though this fact does not show up in
our notation.

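In books on differential geometry, such as Eisenhart [3], it is shown that if
we consider a surface in three-dimensional Euclidean space and a point $(x_0, y_0)$
at which the second partial derivatives of the function $z = f(x, y)$ which describes
the surface exist and are continuous, then the so-called Gaussian or
total curvature $K$ of the surface $z = f(x, y)$ at the point $(x_0, y_0)$ is given by:

$$K = \frac{\left. \dfrac{\partial^2 f}{\partial x^2} \right|_{(x_0, y_0)} \left. \dfrac{\partial^2 f}{\partial y^2} \right|_{(x_0, y_0)} - \left[ \left. \dfrac{\partial^2 f}{\partial x\, \partial y} \right|_{(x_0, y_0)} \right]^2}{\left[ 1 + \left( \left. \dfrac{\partial f}{\partial x} \right|_{(x_0, y_0)} \right)^2 + \left( \left. \dfrac{\partial f}{\partial y} \right|_{(x_0, y_0)} \right)^2 \right]^2}. \tag{3.5}$$

The Gaussian curvature is invariant under translation and rotation of the
coordinate axes. Applying (3.5) to the power surface $\beta = \beta(\theta_1, \theta_2 \mid w)$ at the point
$(\theta_1, \theta_2) = (0, 0)$, imposing the condition of unbiasedness on $w$, and noting that
necessary conditions for an unbiased region are that $\beta_1(w) = \beta_2(w) = 0$, we
have

$$K = \frac{\beta_{11}(w) \beta_{22}(w) - \beta_{12}^2(w)}{(1 + 0 + 0)^2} = \begin{vmatrix} \beta_{11}(w) & \beta_{12}(w) \\ \beta_{12}(w) & \beta_{22}(w) \end{vmatrix} = \det B_w, \tag{3.6}$$

where

$$B_w = \begin{pmatrix} \beta_{11}(w) & \beta_{12}(w) \\ \beta_{12}(w) & \beta_{22}(w) \end{pmatrix}.$$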
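The reduction of (3.5) to (3.6) at a stationary point can be illustrated numerically. The sketch below is only an illustration: the power surface is a hypothetical quadratic (size plus a positive definite quadratic form with made-up coefficients), and the partial derivatives in (3.5) are replaced by central finite differences.

```python
def gaussian_curvature(f, x0, y0, h=1e-3):
    """Gaussian curvature (3.5) of z = f(x, y) at (x0, y0), with the partial
    derivatives replaced by central finite differences of step h."""
    fx = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)
    fy = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)
    fxx = (f(x0 + h, y0) - 2 * f(x0, y0) + f(x0 - h, y0)) / h**2
    fyy = (f(x0, y0 + h) - 2 * f(x0, y0) + f(x0, y0 - h)) / h**2
    fxy = (f(x0 + h, y0 + h) - f(x0 + h, y0 - h)
           - f(x0 - h, y0 + h) + f(x0 - h, y0 - h)) / (4 * h**2)
    return (fxx * fyy - fxy**2) / (1 + fx**2 + fy**2) ** 2

# Hypothetical power surface near the hypothesis point (assumed values).
alpha, b11, b12, b22 = 0.05, 2.0, 0.5, 1.0

def power_surface(t1, t2):
    return alpha + 0.5 * (b11 * t1**2 + 2 * b12 * t1 * t2 + b22 * t2**2)

curvature = gaussian_curvature(power_surface, 0.0, 0.0)
print(curvature, b11 * b22 - b12**2)   # both are det B for this surface
```

Because the first derivatives vanish at the origin, the denominator of (3.5) equals one and the curvature coincides with the determinant of the matrix of second derivatives.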







As a natural generalization of the type A region of the one-parameter theory,
we now propose as a critical region for testing $(\theta_1, \theta_2) = (0, 0)$ that critical
region which maximizes the Gaussian curvature of the power surface at $(0, 0)$,
subject to the conditions of size and unbiasedness. This leads us to the following

DEFINITION. A region $w_0$ is said to be an unbiased critical region of type D for
testing $H_0$ if:

I. $\beta(0, 0 \mid w_0) = \alpha$;

II. $\beta_i(w_0) = 0$, $i = 1, 2$;

III. $B_{w_0}$ is positive definite;

IV. $\det B_{w_0} \ge \det B_w$ for any other region $w$ satisfying I-III.

Condition I specifies the size of the test. Conditions II and III insure the
existence of a relative minimum at $(\theta_1, \theta_2) = (0, 0)$ and so imply the condition
of unbiasedness. Condition IV specifies that the region of type D has maximal
Gaussian curvature among all unbiased regions of the prescribed size.

Let us consider the geometrical interpretation of a region of type D. In the
one-parameter theory we noted that the type A region minimizes the length of
a certain infinitesimal chord on the power curve. We shall now see that the
type D region minimizes the area of a certain infinitesimal ellipse, subject to
the conditions of size and unbiasedness. Consider a Taylor expansion of the
power function in an infinitesimal neighborhood of $(\theta_1, \theta_2) = (0, 0)$. We have,
neglecting infinitesimals of the third and higher orders,

$$\begin{aligned} \beta(\theta_1, \theta_2 \mid w) &= \beta(0, 0 \mid w) + \theta_1 \beta_1 + \theta_2 \beta_2 + \tfrac{1}{2} (\theta_1^2 \beta_{11} + 2 \theta_1 \theta_2 \beta_{12} + \theta_2^2 \beta_{22}) \\ &= \alpha + \tfrac{1}{2} (\theta_1^2 \beta_{11} + 2 \theta_1 \theta_2 \beta_{12} + \theta_2^2 \beta_{22}). \end{aligned} \tag{3.7}$$

Consider the ellipse $\theta_1^2 \beta_{11} + 2 \theta_1 \theta_2 \beta_{12} + \theta_2^2 \beta_{22} = \delta$, where $\delta$ is a positive constant;
this ellipse is a horizontal cross-section of the power surface at an infinitesimal
distance above the minimum point $(\theta_1, \theta_2) = (0, 0)$. It is well known that the
area of this ellipse is given by

$$\frac{\pi \delta}{\sqrt{\begin{vmatrix} \beta_{11} & \beta_{12} \\ \beta_{12} & \beta_{22} \end{vmatrix}}} = \frac{\pi \delta}{\sqrt{\det B}}. \tag{3.8}$$

We have just seen that the region of type D maximizes the determinant of $B$
subject to the conditions of size and unbiasedness. Hence it minimizes the area
of our infinitesimal ellipse subject to these same conditions.
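The area formula (3.8) can be cross-checked against the semi-axes of the ellipse. This is a minimal sketch with hypothetical coefficient values: the semi-axes of a quadratic form set equal to $\delta$ are $\sqrt{\delta/\lambda}$ for the two eigenvalues $\lambda$ of the symmetric matrix $B$, and the area is $\pi$ times their product.

```python
import math

def ellipse_area_from_axes(b11, b12, b22, delta):
    """Area of {b11*x^2 + 2*b12*x*y + b22*y^2 <= delta}, computed from the
    semi-axes sqrt(delta / eigenvalue) of the symmetric matrix B."""
    mean = 0.5 * (b11 + b22)
    spread = math.sqrt((0.5 * (b11 - b22)) ** 2 + b12**2)
    lam_small, lam_large = mean - spread, mean + spread
    if lam_small <= 0:
        raise ValueError("B must be positive definite")
    return math.pi * math.sqrt(delta / lam_small) * math.sqrt(delta / lam_large)

def ellipse_area_from_det(b11, b12, b22, delta):
    """Formula (3.8): pi * delta / sqrt(det B)."""
    return math.pi * delta / math.sqrt(b11 * b22 - b12**2)

# Assumed illustrative values for the second derivatives and for delta.
print(ellipse_area_from_axes(2.0, 0.5, 1.0, 0.01),
      ellipse_area_from_det(2.0, 0.5, 1.0, 0.01))
```

Since the product of the eigenvalues is the determinant, the two computations agree, and a larger determinant gives a smaller ellipse, as stated in the text.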


4. Theorems concerning regions of type D in the two-parameter theory. 
Having defined regions of type D, we now wish to obtain a theorem which will 
characterize the structure of such regions for us. We shall assume the following 
fundamental condition is satisfied: 

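There exists a joint density function $p(E \mid \theta_1, \theta_2)$ for any point $(\theta_1, \theta_2)$ in the
parameter space $\Omega$; and for any fixed region $w$ in the sample space the integral
$\int_w p(E \mid \theta_1, \theta_2)\, dE$ has second partial derivatives with respect to $\theta_1$ and $\theta_2$ in a
neighborhood of $(\theta_1, \theta_2) = (0, 0)$ which are continuous at $(0, 0)$, and the integral
can be differentiated twice under the integral sign with respect to $\theta_1$ and $\theta_2$ at $(0, 0)$.

The derivatives of the above types taken at $(\theta_1, \theta_2) = (0, 0)$ will be denoted
simply as follows:

$$\left. \frac{\partial}{\partial \theta_i} \int_w p(E \mid \theta_1, \theta_2)\, dE \right|_{\theta_1 = \theta_2 = 0} = \int_w p_i\, dE = \beta_i(w), \qquad i = 1, 2, \tag{4.1}$$

$$\left. \frac{\partial^2}{\partial \theta_i\, \partial \theta_j} \int_w p(E \mid \theta_1, \theta_2)\, dE \right|_{\theta_1 = \theta_2 = 0} = \int_w p_{ij}\, dE = \beta_{ij}(w), \qquad i, j = 1, 2, \tag{4.2}$$

where

$$p_i = \left. \frac{\partial p(E \mid \theta_1, \theta_2)}{\partial \theta_i} \right|_{\theta_1 = \theta_2 = 0}, \qquad p_{ij} = \left. \frac{\partial^2 p(E \mid \theta_1, \theta_2)}{\partial \theta_i\, \partial \theta_j} \right|_{\theta_1 = \theta_2 = 0}. \tag{4.3}$$

We also write $p(E \mid 0, 0) = p$.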
We seek a theorem which will tell us how to characterize the structure of a
region $w_0$ such that $\int_{w_0} g_{11}\, dE \int_{w_0} g_{22}\, dE - \left( \int_{w_0} g_{12}\, dE \right)^2$ is a maximum, subject
to the side conditions that $\int_{w_0} f_i\, dE = c_i$, $i = 1, \ldots, m$, where the $g_{ij}$ and
the $f_i$ are given integrable functions and the $c_i$ are given constants. If we have
such a theorem, then by taking $m = 3$, $g_{11} = p_{11}$, $g_{12} = p_{12}$, $g_{22} = p_{22}$, $f_1 = p$,
$f_2 = p_1$, $f_3 = p_2$, $c_1 = \alpha$, $c_2 = 0$, $c_3 = 0$, we will be able to use the theorem to
characterize the structure of a region of type D, since in terms of the $p$'s our
conditions on a type D region are:

I'. $\int_{w_0} p\, dE = \alpha$;

II'. $\int_{w_0} p_i\, dE = 0$, $i = 1, 2$;

III'. The matrix $P_{w_0} = \left( \int_{w_0} p_{ij}\, dE \right)$, $i, j = 1, 2$, is positive definite;

IV'. $\det P_{w_0} \ge \det P_w$ for any other region $w$ satisfying I'-III'.


We will now state the Neyman-Pearson fundamental lemma, which is used 
in the one-parameter theory to find regions of type A, in order to indicate the 
type of theorem we are seeking and also because we shall use this lemma in 
proving our theorem. 

THE NEYMAN-PEARSON FUNDAMENTAL LEMMA. Suppose $m + 1$ given integrable
functions $f_0, f_1, \ldots, f_m$ are defined in an n-dimensional space. Consider
the set of all regions $w$ for which the following conditions are fulfilled:

$$\int_w f_i\, dE = c_i, \qquad i = 1, \ldots, m, \tag{4.4}$$

where the $c_i$ are $m$ given constants. If $w_0$ is a region which satisfies the $m$ conditions
(4.4) and if

$$\begin{aligned} f_0 &\ge \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\ f_0 &\le \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0, \end{aligned} \tag{4.5}$$

for $m$ suitably chosen constants $k_i$, then $w_0$ has the property that

$$\int_{w_0} f_0\, dE \ge \int_w f_0\, dE \tag{4.6}$$

for any region $w$ which satisfies (4.4).
We proceed to state and prove a lemma which will tell us how to characterize
a region $w_0$ maximizing $\int g_{11}\, dE \int g_{22}\, dE$ subject to integral side conditions,
and then to use the lemma and a corollary to it to prove a theorem which will
characterize the structure of a region $w_0$ which maximizes
$\int g_{11}\, dE \int g_{22}\, dE - \left( \int g_{12}\, dE \right)^2$ subject to integral side conditions.

LEMMA 1. Suppose $m + 2$ given integrable functions $g_{11}, g_{22}, f_1, \ldots, f_m$ are
defined in an n-dimensional space. Consider the set of all regions $w$ for which the
following conditions are fulfilled:

$$\int_w f_i\, dE = c_i, \qquad i = 1, \ldots, m, \tag{4.7}$$

$$\int_w g_{ii}\, dE > 0, \qquad i = 1, 2, \tag{4.8}$$

where the $c_i$ are $m$ given constants. If $w_0$ is a region which satisfies conditions (4.7)
and (4.8), and if

$$\begin{aligned} \sum_{i=1}^{2} k_{ii} g_{ii} &\ge \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\ \sum_{i=1}^{2} k_{ii} g_{ii} &\le \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0, \end{aligned} \tag{4.9}$$

where $k_{11} = \int_{w_0} g_{22}\, dE$, $k_{22} = \int_{w_0} g_{11}\, dE$, and the $k_i$ are $m$ suitably chosen constants,
then $w_0$ has the property that

$$\prod_{j=1}^{2} \int_{w_0} g_{jj}\, dE \ge \prod_{j=1}^{2} \int_w g_{jj}\, dE \tag{4.10}$$

for any region $w$ which satisfies (4.7) and (4.8).





We note that we must know our region $w_0$ in advance so that we can calculate
$k_{11}$ and $k_{22}$ and thus verify whether $w_0$ has the structure required by the
lemma or not.

PROOF. We apply the Neyman-Pearson fundamental lemma to the function

$$f_0 = \sum_{i=1}^{2} k_{ii} g_{ii} = g_{11} \int_{w_0} g_{22}\, dE + g_{22} \int_{w_0} g_{11}\, dE.$$

From (4.6) we obtain

$$\int_w g_{11}\, dE \int_{w_0} g_{22}\, dE + \int_w g_{22}\, dE \int_{w_0} g_{11}\, dE \le 2 \int_{w_0} g_{11}\, dE \int_{w_0} g_{22}\, dE \tag{4.11}$$

for any region $w$ satisfying (4.7). Knowing (4.11) we must prove that

$$\int_w g_{11}\, dE \int_w g_{22}\, dE \le \int_{w_0} g_{11}\, dE \int_{w_0} g_{22}\, dE \tag{4.12}$$

for any region $w$ satisfying (4.7) and (4.8).
Let

$$x_j = \frac{\int_w g_{jj}\, dE}{\int_{w_0} g_{jj}\, dE}, \qquad j = 1, 2. \tag{4.13}$$

Since the integrals $\int_w g_{jj}\, dE$, $\int_{w_0} g_{jj}\, dE$, $j = 1, 2$, are all positive by (4.8) we
may rewrite (4.11) and (4.12) in terms of the $x_j$'s as follows:

$$x_1 + x_2 \le 2, \tag{4.14}$$

$$x_1 x_2 \le 1. \tag{4.15}$$

Thus we must prove that $x_1 x_2 \le 1$ whenever $\tfrac{1}{2}(x_1 + x_2) \le 1$, where $x_1$ and $x_2$
are positive real numbers. But this follows immediately from the well known
inequality between the arithmetic and geometric mean, and hence our lemma is
proved.
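Lemma 1 can be tried out on a toy problem. The sketch below is an illustration under assumptions that are not in the paper: the sample space is a hypothetical set of six points with counting measure, so integrals become sums and regions become subsets, and there is a single side condition ($m = 1$) with $f_1 \equiv 1$. It lists every region that satisfies (4.9) for some constant $k_1$ and compares its product (4.10) with the best product over all admissible regions.

```python
from itertools import combinations

# Hypothetical finite sample space with counting measure: integrals are sums.
g11 = [5.0, 4.0, 1.0, 2.0, 1.0, 0.5]
g22 = [4.0, 5.0, 2.0, 1.0, 1.0, 0.5]
f1 = [1.0] * 6          # single side condition (m = 1)
c1 = 2.0                # required value of the sum of f1 over the region

def integral(values, region):
    return sum(values[e] for e in region)

def feasible_regions():
    """All regions satisfying (4.7) and (4.8)."""
    points = range(len(f1))
    for size in range(1, len(f1) + 1):
        for w in combinations(points, size):
            if (integral(f1, w) == c1 and integral(g11, w) > 0
                    and integral(g22, w) > 0):
                yield w

def satisfies_4_9(w0):
    """Check (4.9): some constant k1 separates w0 from its complement.
    Because f1 > 0 everywhere here, this compares the ratios h / f1."""
    k11, k22 = integral(g22, w0), integral(g11, w0)
    ratio = [(k11 * g11[e] + k22 * g22[e]) / f1[e] for e in range(len(f1))]
    inside = min(ratio[e] for e in w0)
    outside = max((ratio[e] for e in range(len(f1)) if e not in w0),
                  default=float("-inf"))
    return inside >= outside

def product(w):
    return integral(g11, w) * integral(g22, w)

regions = list(feasible_regions())
certified = [w for w in regions if satisfies_4_9(w)]
best = max(product(w) for w in regions)
print(certified, [product(w) for w in certified], best)
```

Every region that passes the check attains the maximal product, as the lemma asserts. The converse is not claimed: (4.9) is a sufficient condition, so a maximizing region need not pass it in general.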


COROLLARY. If a region $w_0$ satisfies conditions (4.7), (4.8), and (4.9) of the
lemma, and if $g_{12}$ is a given integrable function for which $\int_{w_0} g_{12}\, dE = 0$, then

$$\int_{w_0} g_{11}\, dE \int_{w_0} g_{22}\, dE - \left( \int_{w_0} g_{12}\, dE \right)^2 \ge \int_w g_{11}\, dE \int_w g_{22}\, dE - \left( \int_w g_{12}\, dE \right)^2 \tag{4.16}$$

for any region $w$ satisfying conditions (4.7) and (4.8) of the lemma.







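We now use the lemma and the corollary to it to prove the following theorem:

THEOREM 1. Suppose the elements $g_{ij}$ of a symmetric $2 \times 2$ matrix

$$G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$$

are given integrable functions defined in an n-dimensional space; and that $f_1, \ldots,
f_m$ are $m$ other given integrable functions defined in this space. For any region $w$, let

$$G_w = \begin{pmatrix} \int_w g_{11}\, dE & \int_w g_{12}\, dE \\ \int_w g_{21}\, dE & \int_w g_{22}\, dE \end{pmatrix}.$$

Consider the set of all regions $w$ for which the following conditions are fulfilled:

$$\int_w f_i\, dE = c_i, \qquad i = 1, \ldots, m, \tag{4.17}$$

$$G_w \text{ is positive definite}, \tag{4.18}$$

where the $c_i$ are $m$ given constants. If $w_0$ is a region which satisfies the conditions
(4.17) and (4.18), and if

$$\begin{aligned} \sum_{i,j=1}^{2} k_{ij} g_{ij} &\ge \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\ \sum_{i,j=1}^{2} k_{ij} g_{ij} &\le \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0, \end{aligned} \tag{4.19}$$

where $k_{11} = \int_{w_0} g_{22}\, dE$, $k_{12} = k_{21} = -\int_{w_0} g_{12}\, dE$, $k_{22} = \int_{w_0} g_{11}\, dE$, and the $k_i$ are
$m$ suitably chosen constants, then $w_0$ has the property that

$$\det G_{w_0} \ge \det G_w \tag{4.20}$$

for any region $w$ which satisfies (4.17) and (4.18).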
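PROOF. We know there exists an orthogonal matrix $H$ of constants which
diagonalizes $G_{w_0}$; that is, $H' G_{w_0} H$ is a diagonal matrix, and $H'H = I$. Apply
this transformation to

$$G = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}$$

and let

$$G^* = \begin{pmatrix} g_{11}^* & g_{12}^* \\ g_{21}^* & g_{22}^* \end{pmatrix} = H'GH.$$

We note that

$$G_{w_0}^* = \begin{pmatrix} \int_{w_0} g_{11}^*\, dE & 0 \\ 0 & \int_{w_0} g_{22}^*\, dE \end{pmatrix} = H' G_{w_0} H. \tag{4.21}$$

Since $H$ is orthogonal, we know that $\det G_{w_0} = \det G_{w_0}^*$, and also $\det G_w =
\det G_w^*$, where

$$G_w^* = \begin{pmatrix} \int_w g_{11}^*\, dE & \int_w g_{12}^*\, dE \\ \int_w g_{21}^*\, dE & \int_w g_{22}^*\, dE \end{pmatrix} = H' G_w H.$$

Thus we see that if $\det G_{w_0}^* \ge \det G_w^*$ for any region $w$ satisfying (4.17) and
(4.18), then $\det G_{w_0} \ge \det G_w$ for any such region, and this is what we seek to
prove. But since $G_{w_0}$ and $G_w$ are positive definite, we know that $G_{w_0}^*$ and $G_w^*$ are
positive definite; hence their diagonal elements are positive and they satisfy
condition (4.8); then by our lemma and its corollary we know that $\det G_{w_0}^* \ge
\det G_w^*$ for any region $w$ satisfying (4.17) and (4.18) (and hence (4.8)), if $w_0$
satisfies

$$\begin{aligned} g_{11}^* \int_{w_0} g_{22}^*\, dE + g_{22}^* \int_{w_0} g_{11}^*\, dE &\ge \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\ g_{11}^* \int_{w_0} g_{22}^*\, dE + g_{22}^* \int_{w_0} g_{11}^*\, dE &\le \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0. \end{aligned} \tag{4.22}$$

It now remains only to prove that the conditions (4.22) are implied by (4.19).
To do this we shall prove that

$$g_{11}^* \int_{w_0} g_{22}^*\, dE + g_{22}^* \int_{w_0} g_{11}^*\, dE = g_{11} \int_{w_0} g_{22}\, dE + g_{22} \int_{w_0} g_{11}\, dE - 2 g_{12} \int_{w_0} g_{12}\, dE.$$

Denote the adjoint of a matrix $A$ by $\operatorname{adj} A$. Then $(\operatorname{adj} G_{w_0}^*) G^* =
H' (\operatorname{adj} G_{w_0}) H H' G H = H' (\operatorname{adj} G_{w_0}) G H$, since $H$ is orthogonal. But

$$g_{11} \int_{w_0} g_{22}\, dE + g_{22} \int_{w_0} g_{11}\, dE - 2 g_{12} \int_{w_0} g_{12}\, dE$$

is the trace of $(\operatorname{adj} G_{w_0}) G$ and similarly

$$g_{11}^* \int_{w_0} g_{22}^*\, dE + g_{22}^* \int_{w_0} g_{11}^*\, dE$$

is the trace of $(\operatorname{adj} G_{w_0}^*) G^*$. Hence our two expressions are equal, as we know
that the trace of a matrix is invariant under an orthogonal transformation, and
we have just seen that $(\operatorname{adj} G_{w_0}^*) G^*$ is obtained from $(\operatorname{adj} G_{w_0}) G$ by such a
transformation. This completes the proof.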
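The trace identity that closes the proof can be checked with numbers. The following sketch uses made-up entries: `G_w0` stands for the matrix of integrals over $w_0$, `G` for the values of the $g_{ij}$ at a single sample point, and `H` is the plane rotation that diagonalizes `G_w0`. The helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def adjugate(a):
    """Classical adjoint of a 2 x 2 matrix."""
    return [[a[1][1], -a[0][1]], [-a[1][0], a[0][0]]]

def trace(a):
    return a[0][0] + a[1][1]

def diagonalizing_rotation(s):
    """Orthogonal H with H' S H diagonal, for a symmetric 2 x 2 matrix S."""
    angle = 0.5 * math.atan2(2 * s[0][1], s[0][0] - s[1][1])
    c, sn = math.cos(angle), math.sin(angle)
    return [[c, -sn], [sn, c]]

# Assumed illustrative numbers, not taken from the paper.
G_w0 = [[3.0, 1.0], [1.0, 2.0]]
G = [[0.7, -0.4], [-0.4, 1.9]]

H = diagonalizing_rotation(G_w0)
Ht = transpose(H)
G_w0_star = matmul(matmul(Ht, G_w0), H)
G_star = matmul(matmul(Ht, G), H)

lhs = trace(matmul(adjugate(G_w0_star), G_star))   # starred form, as in (4.22)
rhs = trace(matmul(adjugate(G_w0), G))             # unstarred form, as in (4.19)
print(G_w0_star[0][1], lhs, rhs)
```

The off-diagonal entry of the transformed matrix vanishes up to rounding, and the two traces agree, which is the equality used to pass from (4.19) to (4.22).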







We shall now prove a result we mentioned earlier; namely, the invariance of 
regions of type D under transformations of the parameters. 

THEOREM 2. If the transformation $\theta_i = T_i(\Theta_1, \Theta_2)$, $i = 1, 2$, is such that the
first and second partial derivatives $\partial \theta_i / \partial \Theta_j$ and $\partial^2 \theta_i / \partial \Theta_j \partial \Theta_s$ exist and are
continuous at $(\Theta_1, \Theta_2) = (0, 0)$, $i, j, s = 1, 2$, the Jacobian $\partial(\theta_1, \theta_2) / \partial(\Theta_1, \Theta_2)$ differs
from zero at $(\Theta_1, \Theta_2) = (0, 0)$, and $(0, 0)$ maps into $(0, 0)$; then a region $w_0$,
which is an unbiased critical region of type D for testing $(\theta_1, \theta_2) = (0, 0)$ against
the set of alternative hypotheses specifying the values of the parameters $\theta_1$ and $\theta_2$,
will remain an unbiased critical region of type D for testing $(\Theta_1, \Theta_2) = (0, 0)$
against the set of transformed hypotheses specifying the values of the new parameters
$\Theta_1$ and $\Theta_2$.

PROOF. We adopt the following notation:

$$K = \left. \frac{\partial \theta_1}{\partial \Theta_1} \right|_{\Theta_1 = \Theta_2 = 0}, \quad L = \left. \frac{\partial \theta_1}{\partial \Theta_2} \right|_{\Theta_1 = \Theta_2 = 0}, \quad M = \left. \frac{\partial \theta_2}{\partial \Theta_1} \right|_{\Theta_1 = \Theta_2 = 0}, \quad N = \left. \frac{\partial \theta_2}{\partial \Theta_2} \right|_{\Theta_1 = \Theta_2 = 0}. \tag{4.23}$$

By hypothesis the determinant of

$$J = \begin{pmatrix} K & L \\ M & N \end{pmatrix}$$

is not equal to zero. We denote by $\beta_i$ and $\beta_{(i)}$ the partial derivatives of the
power function with respect to $\theta_i$ and $\Theta_i$ evaluated at $(\Theta_1, \Theta_2) = (0, 0)$.
Also let

$$B_{(w)} = \begin{pmatrix} \beta_{(11)}(w) & \beta_{(12)}(w) \\ \beta_{(21)}(w) & \beta_{(22)}(w) \end{pmatrix}.$$

Then we can write

$$\begin{aligned} \beta_{(1)} &= \beta_1 K + \beta_2 M, \\ \beta_{(2)} &= \beta_1 L + \beta_2 N. \end{aligned} \tag{4.24}$$

The condition that $\beta(0, 0 \mid w_0) = \alpha$ is unchanged by the transformation of
parameters. Since we know that for an unbiased region $\beta_1 = 0$ and $\beta_2 = 0$, we
obtain from (4.24) that $\beta_{(1)} = 0$ and $\beta_{(2)} = 0$. Thus, since the partial derivatives
of the transformation are continuous, our property of unbiasedness is retained.
Also since $\beta_1 = 0$ and $\beta_2 = 0$, it is easily seen that

$$B_{(w)} = J' B_w J. \tag{4.25}$$

Since $J$ is nonsingular by hypothesis, we know that $B_{(w)}$ is positive definite
since $B_w$ is a positive definite matrix. Also we have that

$$\det B_{(w)} = (\det J)^2 \det B_w; \tag{4.26}$$

and since $\det J \ne 0$, it follows that if $\det B_{w_0} \ge \det B_w$, then $\det B_{(w_0)} \ge
\det B_{(w)}$. Thus we have seen that $w_0$ satisfies all the conditions of a region of
type D for testing $(\Theta_1, \Theta_2) = (0, 0)$, and our proof is completed.
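The two facts carried by (4.25) and (4.26), that the congruence by a nonsingular $J$ preserves positive definiteness and multiplies every determinant by the same factor $(\det J)^2$, can be seen in a small numerical sketch. The matrices below are hypothetical; `B_w0` and `B_w` stand for the second-derivative matrices of two competing regions.

```python
def transpose(m):
    return [[m[j][i] for j in range(2)] for i in range(2)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def is_positive_definite(m):
    """Leading-minor test for a symmetric 2 x 2 matrix."""
    return m[0][0] > 0 and det2(m) > 0

# Assumed Jacobian entries K, L, M, N at the origin, and two assumed
# second-derivative matrices with det B_w0 > det B_w.
J = [[1.5, 0.2], [-0.3, 0.8]]
B_w0 = [[2.0, 0.5], [0.5, 1.0]]
B_w = [[1.5, 0.2], [0.2, 1.0]]

def transformed(B):
    """Second-derivative matrix in the new parameters, as in (4.25)."""
    return matmul(matmul(transpose(J), B), J)

for B in (B_w0, B_w):
    print(det2(transformed(B)), det2(J) ** 2 * det2(B),
          is_positive_definite(transformed(B)))
```

Since both determinants are scaled by the same positive factor, their ordering is unchanged, which is the step that carries condition IV over to the new parameters.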

The inequalities which must hold within and outside the unbiased critical 
regions of type D can frequently be simplified if we express them in terms of 
the derivatives of log p(E | 6; , 02). We write 


_ 9 log p(E | &, 62) | 
06, |(01.02)—(0.0)" 


_ & log p(E | 41, 62) | 
06,008, 1(@1,02)—(0,0). 


d: 
(4.27) 


ts 
where ¢, s = 1, 2. In particular, the simplification will be considerable if 
(4.28) ou = A te + Buds “+ Cudr, 


where A;,, Biz, Crs are independent of the sample point E but may depend on 
(0, , 62). If (4.28) is true, it will be seen that 


(4.29) a = dip, ie dep, 
(4.30) Pts = (bibs + Ate + Besti + Cee). 


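Consequently, the type of inequalities (4.19) occurring among the sufficient conditions of Theorem 1 for a region of type D will reduce to the following for points where $p > 0$ (assuming that $W_+(\theta_1, \theta_2) = \{E \mid p(E \mid \theta_1, \theta_2) > 0\}$ is independent of $(\theta_1, \theta_2)$):

\[
\left( \int_{w_0} p_{22}\, dE \right) \phi_1^2 + \left( \int_{w_0} p_{11}\, dE \right) \phi_2^2 - 2 \left( \int_{w_0} p_{12}\, dE \right) \phi_1 \phi_2 \;\geq\; k_1' + k_2' \phi_1 + k_3' \phi_2, \tag{4.31}
\]

where the $k_i'$ are new constants easily expressible in terms of the $k_i$, the integrals $\int_{w_0} p_{ij}\, dE$, and the coefficients in (4.28). The $k_i'$ must be determined so as to satisfy $\int_{w_0} p\, dE = \alpha$, $\int_{w_0} p_1\, dE = 0$, $\int_{w_0} p_2\, dE = 0$, which, owing to (4.29), reduce to $\int_{w_0} p\, dE = \alpha$, $\int_{w_0} \phi_1 p\, dE = 0$, $\int_{w_0} \phi_2 p\, dE = 0$, respectively. Using these relationships, the inequality (4.31) will further simplify to

\[
\left( \int_{w_0} \phi_2^2\, p\, dE + A_{22} \alpha \right) \phi_1^2 + \left( \int_{w_0} \phi_1^2\, p\, dE + A_{11} \alpha \right) \phi_2^2 - 2 \left( \int_{w_0} \phi_1 \phi_2\, p\, dE + A_{12} \alpha \right) \phi_1 \phi_2 \;\geq\; k_1' + k_2' \phi_1 + k_3' \phi_2. \tag{4.32}
\]

Here the sign $\geq$ applies in $w_0$ and $\leq$ outside $w_0$. The region described by this inequality is obviously the region outside an ellipse in the $\phi_1, \phi_2$-plane.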
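
The passage from (4.28) to (4.30), and from (4.31) to (4.32), rests only on the chain rule and on the size and unbiasedness conditions. As a sketch, writing $p = p(E \mid 0, 0)$ and taking all derivatives at $(\theta_1, \theta_2) = (0, 0)$:

```latex
% phi_t and phi_ts are the first and second derivatives of log p, so
\[
\phi_t = \frac{p_t}{p}, \qquad
\phi_{ts} = \frac{p_{ts}}{p} - \frac{p_t\, p_s}{p^2}
\quad\Longrightarrow\quad
p_t = \phi_t\, p, \qquad p_{ts} = (\phi_{ts} + \phi_t \phi_s)\, p .
\]
% Substituting (4.28) for phi_ts gives (4.30). Integrating (4.30) over w_0 and using
% \int p dE = alpha, \int phi_1 p dE = \int phi_2 p dE = 0 gives the coefficients of (4.32):
\[
\int_{w_0} p_{ts}\, dE = \int_{w_0} \phi_t \phi_s\, p\, dE + A_{ts}\, \alpha .
\]
```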





228 STANLEY L. ISAACSON 


5. Generalization of the theory to the k-parameter case. We shall now indicate how to generalize the theory of Sections 3 and 4 to the case where we have $k$ parameters, where $k > 2$. Our main task here will be to obtain a generalization of Lemma 1 and Theorem 1 of Section 4.

The power function is now designated by $\beta(\theta_1, \theta_2, \ldots, \theta_k \mid w)$, and we are testing the hypothesis that $(\theta_1, \theta_2, \ldots, \theta_k) = (0, 0, \ldots, 0)$. For brevity we write $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$, so that $\beta(\theta \mid w)$ now will symbolize the power function and the hypothesis is $\theta = 0 = (0, 0, \ldots, 0)$. $\beta_i(w)$ and $\beta_{ij}(w)$ will again denote the partial derivatives of $\beta(\theta \mid w)$ evaluated at $\theta = 0$, where now $i$ and $j$ run from 1 to $k$.


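We now define the generalized Gaussian curvature of $\beta(\theta \mid w)$ at $\theta = 0$ as follows:

\[
K = \frac{\begin{vmatrix} \beta_{11}(w) & \cdots & \beta_{1k}(w) \\ \vdots & & \vdots \\ \beta_{k1}(w) & \cdots & \beta_{kk}(w) \end{vmatrix}}{\left( 1 + \sum_{j=1}^{k} \beta_j^2(w) \right)^{(k+2)/2}}
= \frac{\det B_w}{\left( 1 + \sum_{j=1}^{k} \beta_j^2(w) \right)^{(k+2)/2}}, \tag{5.1}
\]

where

\[
B_w = \begin{pmatrix} \beta_{11}(w) & \cdots & \beta_{1k}(w) \\ \vdots & & \vdots \\ \beta_{k1}(w) & \cdots & \beta_{kk}(w) \end{pmatrix}.
\]

The generalized Gaussian curvature is invariant under translation and rotation of the coordinate axes in the $(k + 1)$-dimensional space of $(\beta, \theta_1, \theta_2, \ldots, \theta_k)$. Imposing our condition of unbiasedness on $\beta(\theta \mid w)$ at $\theta = 0$ gives us $\beta_j(w) = 0$, $j = 1, \ldots, k$; and hence we have for $\beta(\theta \mid w)$ at $\theta = 0$

\[
K = \det B_w. \tag{5.2}
\]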


In view of this discussion, our definition of a region of type D in Section 3 
immediately generalizes to k parameters. 

Geometrically, the region of type D may be regarded as minimizing the volume of a certain infinitesimal $k$-dimensional ellipsoid $\sum_{i,j=1}^{k} \beta_{ij} \theta_i \theta_j = h$, as explained in detail in Section 3 for the case $k = 2$.

We again assume the fundamental condition at the beginning of Section 4 is satisfied. We use the notation of (4.1), (4.2), and (4.3); i.e., $\beta_i(w) = \int_w p_i\, dE$, and $\beta_{ij}(w) = \int_w p_{ij}\, dE$, where now $i, j = 1, \ldots, k$, and we let $p(E \mid 0) = p$. In terms of these $p$'s our conditions on a type D region are expressed by I'-IV' of Section 4, with $i$ and $j$ running from 1 to $k$.

We thus see that in the k-parameter theory we need a theorem which tells how to characterize the structure of a region which maximizes the determinant of a symmetric positive definite $k \times k$ matrix, whose elements are integrals over the region, subject to integral side conditions. To this end, we obtain generalizations of Lemma 1 and Theorem 1 of Section 4.

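We generalize the statement of Lemma 1 by replacing 2 by $k$ whenever a 2 occurs in the statement. Relation (4.9) is replaced by

\[
\begin{aligned}
\sum_{i=1}^{k} k_{ii}\, g_{ii} &\geq \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\
\sum_{i=1}^{k} k_{ii}\, g_{ii} &\leq \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0,
\end{aligned} \tag{5.3}
\]

where

\[
k_{ii} = \prod_{\substack{j=1 \\ j \neq i}}^{k} \int_{w_0} g_{jj}\, dE.
\]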
The proof of the lemma then proceeds exactly as it does in the case k = 2. 
The corollary to Lemma 1 is now given in the following form: 
COROLLARY. Consider a symmetric matrix

\[
G = \begin{pmatrix} g_{11} & \cdots & g_{1k} \\ \vdots & & \vdots \\ g_{k1} & \cdots & g_{kk} \end{pmatrix}
\]

whose elements are given integrable functions defined in an n-dimensional space. For any region $w$ in this space, let

\[
G_w = \begin{pmatrix} \int_w g_{11}\, dE & \cdots & \int_w g_{1k}\, dE \\ \vdots & & \vdots \\ \int_w g_{k1}\, dE & \cdots & \int_w g_{kk}\, dE \end{pmatrix}.
\]

Now if $w_0$ is a region that satisfies the conditions (4.7), (4.8), and (5.3) of the lemma, and if furthermore $\int_{w_0} g_{ij}\, dE = 0$ when $i \neq j$, then $\det G_{w_0} \geq \det G_w$, where $w$ is any region in the space for which $G_w$ is positive definite and the conditions (4.7) are satisfied.

PROOF.

\[
\det G_{w_0} = \prod_{j=1}^{k} \int_{w_0} g_{jj}\, dE \;\geq\; \prod_{j=1}^{k} \int_{w} g_{jj}\, dE \;\geq\; \det G_w, \tag{5.4}
\]

where the first inequality follows from the lemma and the second is a well known inequality for positive definite matrices (see Cramér [4], p. 116).
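
The second inequality in (5.4) says that the determinant of a positive definite matrix does not exceed the product of its diagonal elements. The following is a minimal numerical sketch of that fact, not part of the original argument; the helper names `det`, `random_positive_definite`, and `hadamard_gap` are hypothetical and introduced here only for illustration.

```python
import random


def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] for row in a]
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[piv][c]) < 1e-15:
            return 0.0
        if piv != c:
            m[c], m[piv] = m[piv], m[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return d


def random_positive_definite(k, rng):
    """Return A A^T + I for a random k x k matrix A (always positive definite)."""
    a = [[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(k)]
    return [
        [sum(a[i][t] * a[j][t] for t in range(k)) + (1.0 if i == j else 0.0)
         for j in range(k)]
        for i in range(k)
    ]


def hadamard_gap(g):
    """Product of the diagonal elements minus the determinant (should be >= 0)."""
    prod = 1.0
    for i in range(len(g)):
        prod *= g[i][i]
    return prod - det(g)
```

For a diagonal matrix the gap is zero, which is the situation of $G_{w_0}$ in the corollary, where the off-diagonal integrals vanish.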
Proceeding to Theorem 1, we generalize the statement by once again replacing 2 by $k$ whenever a 2 occurs in the statement. Relation (4.19) is replaced by

\[
\begin{aligned}
\sum_{i,j=1}^{k} k_{ij}\, g_{ij} &\geq \sum_{i=1}^{m} k_i f_i \quad \text{in } w_0, \\
\sum_{i,j=1}^{k} k_{ij}\, g_{ij} &\leq \sum_{i=1}^{m} k_i f_i \quad \text{outside } w_0,
\end{aligned} \tag{5.5}
\]

where $k_{ij}$ is the $(i, j)$ element in the adjoint matrix of $G_{w_0}$. We note that $\sum \sum k_{ij} g_{ij} = \operatorname{trace}\,[(\operatorname{adj} G_{w_0}) G]$, where $\operatorname{adj} G_{w_0}$ denotes the adjoint matrix of $G_{w_0}$. The proof of the theorem then proceeds exactly as it does in the case $k = 2$.
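
The trace identity noted after (5.5) can be checked numerically for small symmetric matrices. The sketch below is illustrative only: the function names are hypothetical, and the closing remark about the first-order change of the determinant is the standard Jacobi formula, offered as one way to read why the adjoint appears rather than as something the text states.

```python
def minor(a, i, j):
    """Matrix a with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]


def det(a):
    """Determinant by Laplace expansion along the first row (fine for small k)."""
    if len(a) == 0:
        return 1.0
    if len(a) == 1:
        return a[0][0]
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))


def adjugate(a):
    """Adjoint (adjugate) matrix: entry (i, j) is the cofactor of a[j][i]."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]


def weighted_sum(g0, g):
    """Sum over i, j of k_ij * g_ij, with k_ij the (i, j) element of adj(g0)."""
    k = adjugate(g0)
    n = len(g)
    return sum(k[i][j] * g[i][j] for i in range(n) for j in range(n))


def trace_of_product(a, b):
    """trace(a b) computed directly from the definition."""
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))
```

With `g0 = [[2, 1], [1, 3]]` and `g = [[1, 4], [4, 5]]`, both `weighted_sum(g0, g)` and `trace_of_product(adjugate(g0), g)` equal 5, which is also the derivative of `det(g0 + t*g)` at `t = 0`.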

Regions of type D in the k-parameter theory remain invariant under trans- 
formations of the parameter space which are locally one-to-one and twice dif- 
ferentiable with continuous partial derivatives. This result is obtained by a 
direct and immediate generalization of Theorem 2 in Section 4. 

As in the two-parameter theory, the inequalities which must hold within and outside the unbiased critical region of type D can frequently be simplified if we express them in terms of the derivatives of $\log p(E \mid \theta)$. We write:

\[
\phi_t = \left. \frac{\partial \log p(E \mid \theta)}{\partial \theta_t} \right|_{\theta = 0}, \qquad
\phi_{ts} = \left. \frac{\partial^2 \log p(E \mid \theta)}{\partial \theta_t \, \partial \theta_s} \right|_{\theta = 0}, \tag{5.6}
\]

where $t, s = 1, 2, \ldots, k$. In particular the simplification will be considerable if

\[
\phi_{ts} = A_{ts} + \sum_{j=1}^{k} B_{tsj}\, \phi_j, \qquad t, s = 1, 2, \ldots, k, \tag{5.7}
\]

where $A_{ts}$ and the $B_{tsj}$ are independent of the sample point $E$ but may depend on $\theta$. An unbiased critical region of type D found by application of Theorem 1 will then be the outside of an ellipsoid in the space of the $\phi_t$, $t = 1, \ldots, k$.


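6. Examples. Suppose that the joint density functions specified by the admissible hypotheses are all of the form

\[
p(E \mid \mu_1, \mu_2) = (2\pi)^{-(n_1 + n_2)/2}\, \sigma_1^{-n_1} \sigma_2^{-n_2} \exp \left\{ -\frac{1}{2} \left[ \sum_{i=1}^{n_1} \frac{(x_i - \mu_1)^2}{\sigma_1^2} + \sum_{i=n_1+1}^{n_1+n_2} \frac{(x_i - \mu_2)^2}{\sigma_2^2} \right] \right\}, \tag{6.1}
\]

with known $\sigma_1$ and $\sigma_2$, for $-\infty < x_i < \infty$, $i = 1, 2, \ldots, (n_1 + n_2)$. Thus it is assumed that the observations represent two samples of $n_1$ and $n_2$ individuals respectively, randomly and independently drawn from two normal populations with known standard deviations $\sigma_1$ and $\sigma_2$ respectively and with unknown means equal respectively to $\mu_1$ and $\mu_2$. The simple hypothesis $H_0$ to be tested is that $(\mu_1, \mu_2) = (0, 0)$. We shall find the unbiased critical region of type D for testing $H_0$.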







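The joint density function (6.1), as is well known, satisfies all the conditions required in the present theory. Making some simple calculations and substituting into (4.32), we obtain

\[
\begin{aligned}
f(\bar{x}_1, \bar{x}_2) ={}& \left( \int_{w_0} \left( \frac{n_1 \bar{x}_1}{\sigma_1^2} \right)^2 p\, dE - \frac{n_1 \alpha}{\sigma_1^2} \right) \left( \frac{n_2 \bar{x}_2}{\sigma_2^2} \right)^2
+ \left( \int_{w_0} \left( \frac{n_2 \bar{x}_2}{\sigma_2^2} \right)^2 p\, dE - \frac{n_2 \alpha}{\sigma_2^2} \right) \left( \frac{n_1 \bar{x}_1}{\sigma_1^2} \right)^2 \\
& - 2 \left( \int_{w_0} \frac{n_1 n_2 \bar{x}_1 \bar{x}_2}{\sigma_1^2 \sigma_2^2}\, p\, dE \right) \left( \frac{n_1 n_2 \bar{x}_1 \bar{x}_2}{\sigma_1^2 \sigma_2^2} \right)
- k_1 - k_2 \frac{n_1 \bar{x}_1}{\sigma_1^2} - k_3 \frac{n_2 \bar{x}_2}{\sigma_2^2} \;\geq\; 0
\end{aligned} \tag{6.2}
\]

as the inequality defining the critical region $w_0$ we seek, providing it can be found by our methods, where $\bar{x}_1$ and $\bar{x}_2$ denote the means of the two samples. It is seen that $w_0$ is bounded by a surface corresponding to the equation $f(\bar{x}_1, \bar{x}_2) = \text{constant}$, and that, therefore, the conditions $\int_{w_0} p\, dE = \alpha$, $\int_{w_0} \phi_1 p\, dE = 0$, $\int_{w_0} \phi_2 p\, dE = 0$, which the critical region has to satisfy, and also the integrals involved in (4.32) can be expressed by means of integrals taken over a region $w_0$ in the plane of $\bar{x}_1$ and $\bar{x}_2$, determined by the same inequality (6.2). Of course, instead of the original joint density function $p(E \mid 0, 0)$, we shall have that of $\bar{x}_1$ and $\bar{x}_2$. We further simplify our notation by introducing, instead of $\bar{x}_1$ and $\bar{x}_2$, the variables

\[
u = \sqrt{n_1}\, \bar{x}_1 / \sigma_1, \qquad v = \sqrt{n_2}\, \bar{x}_2 / \sigma_2. \tag{6.3}
\]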


Our problem will now be to guess a region $w_0$ in the $u, v$-plane and then see if we can determine the constants $k_i$ so that the plane region determined by the inequality

\[
\frac{n_1 n_2}{\sigma_1^2 \sigma_2^2} \left[ \left( \iint_{w_0} u^2 p(u, v)\, du\, dv - \alpha \right) v^2 + \left( \iint_{w_0} v^2 p(u, v)\, du\, dv - \alpha \right) u^2 - 2 \left( \iint_{w_0} u v\, p(u, v)\, du\, dv \right) u v \right]
\;\geq\; k_1 + k_2 \frac{\sqrt{n_1}}{\sigma_1} u + k_3 \frac{\sqrt{n_2}}{\sigma_2} v \tag{6.4}
\]

will be the region $w_0$, where $w_0$ satisfies the following conditions:

\[
\iint_{w_0} p(u, v)\, du\, dv = \alpha ; \tag{6.5}
\]
\[
\iint_{w_0} u\, p(u, v)\, du\, dv = 0, \qquad \iint_{w_0} v\, p(u, v)\, du\, dv = 0 ; \tag{6.6}
\]
\[
\begin{pmatrix}
\iint_{w_0} u^2 p(u, v)\, du\, dv - \alpha & \iint_{w_0} u v\, p(u, v)\, du\, dv \\
\iint_{w_0} u v\, p(u, v)\, du\, dv & \iint_{w_0} v^2 p(u, v)\, du\, dv - \alpha
\end{pmatrix} \tag{6.7}
\]

is positive definite, where

\[
p(u, v) = (2\pi)^{-1} \exp \left[ -\tfrac{1}{2} (u^2 + v^2) \right]. \tag{6.8}
\]

(6.5) is the condition of size; and (6.6), (6.7) are the conditions of unbiasedness. If we have such a region, then by Theorem 1, $w_0$ is an unbiased critical region of type D for testing $(\mu_1, \mu_2) = (0, 0)$. In the $u, v$-plane, the likelihood ratio test indicates the region $u^2 + v^2 \geq a^2$ for testing $H_0$, where $a^2$ is determined so as to give size $\alpha$ to the test. Since $u^2$ and $v^2$ are each independently distributed as $\chi^2$ with one degree of freedom, $u^2 + v^2$ is distributed as $\chi^2$ with two degrees of freedom and so $a^2$ can be obtained from a $\chi^2$-table. We shall take $u^2 + v^2 \geq a^2$ as the region $w_0$ and shall verify that $k_1$, $k_2$, $k_3$ in (6.4) can indeed be determined so as to give rise to this region. We will also see that (6.7) is satisfied for this region. Then since $u^2 + v^2 \geq a^2$ obviously satisfies the condition (6.6) by symmetry considerations, and $a^2$ has been determined so as to satisfy (6.5), this will prove that $u^2 + v^2 \geq a^2$ is an unbiased critical region of type D for testing $H_0$.
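One can easily verify that

\[
\iint_{u^2 + v^2 > a^2} u^2 p(u, v)\, du\, dv = \iint_{u^2 + v^2 > a^2} v^2 p(u, v)\, du\, dv = \alpha \left( 1 + \tfrac{1}{2} a^2 \right) ; \tag{6.9}
\]

and since $p(u, v)$ is an even function,

\[
\iint_{u^2 + v^2 > a^2} u v\, p(u, v)\, du\, dv = 0. \tag{6.10}
\]

In view of these relations, we see that the matrix in (6.7) is

\[
\begin{pmatrix} \alpha a^2 / 2 & 0 \\ 0 & \alpha a^2 / 2 \end{pmatrix},
\]

which is obviously positive definite. Also, (6.4) can now be written as

\[
\frac{n_1 n_2\, \alpha a^2}{2 \sigma_1^2 \sigma_2^2} \left[ u^2 + v^2 \right] \;\geq\; k_1 + k_2 \frac{\sqrt{n_1}}{\sigma_1} u + k_3 \frac{\sqrt{n_2}}{\sigma_2} v. \tag{6.11}
\]

If we choose $k_1 = n_1 n_2\, \alpha a^4 / (2 \sigma_1^2 \sigma_2^2)$, $k_2 = 0$, $k_3 = 0$, the inequality (6.11) becomes

\[
u^2 + v^2 \geq a^2, \tag{6.12}
\]

and this proves our result.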
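
The relations (6.5), (6.9), and (6.10) for the region $u^2 + v^2 > a^2$ can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses a plain midpoint rule in polar coordinates, truncates the radial integral at a finite distance, and uses the fact that for two degrees of freedom the size is $\alpha = e^{-a^2/2}$. The function name `integrate_outside_circle` and its parameters are hypothetical.

```python
import math


def integrate_outside_circle(f, a_sq, n_r=2000, n_theta=64, r_span=12.0):
    """Midpoint-rule approximation of the integral of f(u, v) * p(u, v) over
    u^2 + v^2 > a_sq, where p is the standard bivariate normal density (6.8).
    The radial range is truncated at sqrt(a_sq) + r_span (negligible tail)."""
    a = math.sqrt(a_sq)
    dr = r_span / n_r
    dth = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        r = a + (i + 0.5) * dr
        # density times the polar Jacobian r, times the cell area element
        weight = math.exp(-0.5 * r * r) / (2.0 * math.pi) * r * dr * dth
        for j in range(n_theta):
            th = (j + 0.5) * dth
            total += f(r * math.cos(th), r * math.sin(th)) * weight
    return total


alpha = 0.05
a_sq = -2.0 * math.log(alpha)  # chi-square (2 d.f.) critical value for size alpha
size = integrate_outside_circle(lambda u, v: 1.0, a_sq)
second_moment = integrate_outside_circle(lambda u, v: u * u, a_sq)
cross_moment = integrate_outside_circle(lambda u, v: u * v, a_sq)
```

Under these assumptions `size` should be close to `alpha`, `second_moment` close to `alpha * (1 + a_sq / 2)`, and `cross_moment` close to zero.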

This result can in turn be used to find an unbiased critical region of type D 
for testing a simple hypothesis about the means of a bivariate normal popula- 
tion with known covariance matrix, since it is possible by an orthogonal trans- 
formation of variables to transform this problem into the one we have solved. 

The result of (6.12) can also be immediately extended to find an unbiased critical region of type D for testing a simple hypothesis about the means of $k$ independent normal populations with known variances; and then this latter result can be used to find a type D region for testing a simple hypothesis about the means of a k-variate normal distribution with known covariance matrix. The type D regions in these cases turn out to be the likelihood ratio tests.

My attempts to find an unbiased critical region of type D for testing a simple hypothesis about the mean and variance of a univariate normal distribution on the basis of a random sample of size $n$ were unsuccessful because I was unable to evaluate the integrals occurring on the left side of our basic inequality (4.19) over the conjectured region; there were also other difficulties involved. One can, however, use the result of (6.12) for large sample sizes to approximate a type D region for testing the simple hypothesis $(\mu, \sigma^2) = (\mu_0, \sigma_0^2)$. Since $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $(s')^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ are joint sufficient statistics for $\mu$ and $\sigma^2$, just as we reduced the problem of testing a simple hypothesis about the means of two normal populations to a problem in the $\bar{x}_1, \bar{x}_2$-plane by use of (6.2), so we can reduce the problem of testing $(\mu, \sigma^2) = (\mu_0, \sigma_0^2)$ to a problem in the $\bar{x}, (s')^2$-plane. The density function of $\bar{x}$ is normal with mean $\mu_0$ and variance $\sigma_0^2 / n$ under the null hypothesis, and the density function of $(n - 1)(s')^2 / \sigma_0^2$ is that of $\chi^2$ with $(n - 1)$ degrees of freedom under the null hypothesis; since $\bar{x}$ and $(s')^2$ are independently distributed in a normal population, we can use these two density functions immediately to obtain the joint density function of $\bar{x}$ and $(s')^2$. The problem of finding a type D region in the $\bar{x}, (s')^2$-plane is, however, the one I was unable to solve. But we know that $(n - 1)(s')^2 / \sigma_0^2$ has a $\chi^2$ distribution with mean $(n - 1)$ and variance $2(n - 1)$ and we also know that a $\chi^2$ distribution with $m$ degrees of freedom is asymptotically normal with mean $m$ and variance $2m$; hence we know that $(s')^2$ is asymptotically normally distributed with mean $\sigma_0^2$ and variance $2\sigma_0^4 / (n - 1)$ under the null hypothesis. If we approximate the density function of $(s')^2$ by a normal density function with mean $\sigma_0^2$ and variance $2\sigma_0^4 / (n - 1)$, and let

\[
u = \frac{\sqrt{n}\, (\bar{x} - \mu_0)}{\sigma_0}, \qquad v = \frac{\sqrt{n - 1}\, \bigl( (s')^2 - \sigma_0^2 \bigr)}{\sqrt{2}\, \sigma_0^2}, \tag{6.13}
\]

then with this approximation our problem becomes that of finding a region $w_0$ in the $u, v$-plane satisfying (6.4), subject to conditions (6.5)-(6.7), where $p(u, v)$ is given by (6.8), and in (6.4) $n_1 = n$, $n_2 = (n - 1)$, $\sigma_1 = \sigma_0$, $\sigma_2 = \sqrt{2}\, \sigma_0^2$. For this problem we have seen that the solution is given by (6.12). In the $\bar{x}, (s')^2$-plane this gives the region

\[
\frac{n (\bar{x} - \mu_0)^2}{\sigma_0^2} + \frac{(n - 1) \bigl( (s')^2 - \sigma_0^2 \bigr)^2}{2 \sigma_0^4} \;\geq\; a^2, \tag{6.14}
\]

where $a^2$ is determined from a $\chi^2$-table with two degrees of freedom. For large sample sizes this region should be a good approximation to an unbiased critical region of type D for testing $(\mu, \sigma^2) = (\mu_0, \sigma_0^2)$.
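
As an illustration of how the approximate region (6.14) might be applied, the sketch below computes the left side of (6.14) from a sample and compares it with $a^2$. It is a minimal sketch, not the author's procedure in code: the function names are hypothetical, and the critical value uses the closed form $a^2 = -2 \log \alpha$, which holds because the upper tail of $\chi^2$ with two degrees of freedom is $e^{-x/2}$.

```python
import math


def approx_type_d_statistic(xs, mu0, sigma0_sq):
    """Left side of (6.14): u^2 + v^2 with u, v as defined in (6.13)."""
    n = len(xs)
    xbar = sum(xs) / n
    s_sq = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # (s')^2
    u = math.sqrt(n) * (xbar - mu0) / math.sqrt(sigma0_sq)
    v = math.sqrt(n - 1) * (s_sq - sigma0_sq) / (math.sqrt(2.0) * sigma0_sq)
    return u * u + v * v


def critical_value(alpha):
    """a^2 such that a chi-square variable with 2 d.f. exceeds it with probability alpha."""
    return -2.0 * math.log(alpha)


def reject(xs, mu0, sigma0_sq, alpha):
    """True if the sample falls in the approximate region (6.14)."""
    return approx_type_d_statistic(xs, mu0, sigma0_sq) >= critical_value(alpha)
```

Since (6.14) relies on the normal approximation to the distribution of $(s')^2$, the stated size is only approximate for small $n$.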


7. Remarks on the theory of testing composite hypotheses with two or more constraints. A composite hypothesis with $k$ constraints is a hypothesis which specifies the values of $k$ parameters out of a total of $s$ parameters, where $k < s$. At present the theory of composite hypotheses with two or more constraints is in much less satisfactory shape than the theory of composite hypotheses with one constraint. (For the latter see Scheffé [5] and Lehmann [6].) We can define an unbiased critical region of type E for testing a composite hypothesis with $k$ constraints ($k \geq 2$, $k < s$) as follows:

DEFINITION. Let $\Theta = (\theta_1, \theta_2, \ldots, \theta_k, \theta_{k+1}, \ldots, \theta_s) = (\theta_1, \theta_2, \ldots, \theta_k, \tau)$ denote the parameter point in the parameter space $\Omega$ which is a subset of an s-dimensional Euclidean space, where $\tau = (\theta_{k+1}, \ldots, \theta_s)$ denotes the nuisance parameters (i.e., the parameters unspecified by the hypothesis). The hypothesis $H_0$ states $\Theta$ lies in the k-dimensional subspace $\omega$ of $\Omega$ defined by $\theta = \theta^0$, where $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$ and $\theta^0 = (\theta_1^0, \theta_2^0, \ldots, \theta_k^0)$. Then $w_0$ is said to be an unbiased critical region of type E for testing $H_0$ if for all $\Theta$ in $\omega$ (i.e., all $(\theta^0, \tau)$):

I. $\beta(\theta^0, \tau \mid w_0) = \alpha$, where $\alpha$ is independent of $\tau$;

II. $\beta_i(\theta^0, \tau \mid w_0) = 0$ for $i = 1, \ldots, k$;

III. $B_{w_0} = \begin{pmatrix} \beta_{11}(\theta^0, \tau \mid w_0) & \cdots & \beta_{1k}(\theta^0, \tau \mid w_0) \\ \vdots & & \vdots \\ \beta_{k1}(\theta^0, \tau \mid w_0) & \cdots & \beta_{kk}(\theta^0, \tau \mid w_0) \end{pmatrix}$ is positive definite;

IV. $\det B_{w_0} \geq \det B_w$ for any region $w$ satisfying I-III.

These regions of type E should prove useful in the further development of the theory of composite hypotheses with two or more constraints.


I am deeply indebted to Professor Henry Scheffé for having suggested this 
line of research to me and for numerous very helpful suggestions he has given 
me in the course of pursuing it. 


REFERENCES

[1] J. NEYMAN AND E. S. PEARSON, "Contributions to the theory of testing statistical hypotheses, part I," Stat. Res. Memoirs, Vol. 1 (1936), pp. 1-37.
[2] J. NEYMAN AND E. S. PEARSON, "Contributions to the theory of testing statistical hypotheses, parts II and III," Stat. Res. Memoirs, Vol. 2 (1938), pp. 25-57.
[3] L. P. EISENHART, Differential Geometry, Ginn and Co., 1909.
[4] H. CRAMÉR, Mathematical Methods of Statistics, Princeton University Press, 1946.
[5] H. SCHEFFÉ, "On the theory of testing composite hypotheses with one constraint," Annals of Math. Stat., Vol. 13 (1942), pp. 280-293.
[6] E. L. LEHMANN, "On optimum tests of composite hypotheses with one constraint," Annals of Math. Stat., Vol. 18 (1947), pp. 473-494.





DESIGNS FOR TWO-WAY ELIMINATION OF HETEROGENEITY


By S. S. SHRIKHANDE
University of North Carolina


1. Introduction and summary. Sometimes in a design the position within the 
block is important as a source of variation, and the experiment gains in effi- 
ciency by eliminating the positional effect. The classical example is due to 
Youden in his studies on the tobacco mosaic virus [1]. He found that the response 
to treatments also depends on the position of the leaf on the plant. If the num- 
ber of leaves is sufficient so that every treatment can be applied to one leaf of a 
tree, then we get an ordinary Latin square, in which the trees are columns and 
the leaves belonging to the same position constitute the rows. But if the num- 
ber of treatments is larger than the number of leaf positions available, then we 
must have incomplete columns. Youden used a design in which the columns 
constituted a balanced incomplete block design, whereas the rows were com- 
plete. These designs are known as Youden’s squares, and can be used when 
two-way elimination of heterogeneity is desired. 

In Fisher and Yates statistical tables [2] balanced incomplete block designs 
in which the number of blocks b is equal to the number of treatments v have been 
used to obtain Youden’s squares, and the authors state that “‘in all cases of 
practical importance” it has been found possible to convert balanced incomplete 
blocks of the above kind to a Youden’s square by so ordering the varieties in 
the blocks that each variety occurs once in each position. F. W. Levi noted 
([3], p. 6) that this reordering can always be done, in virtue of a theorem given 
by König [4] which states that an even regular graph of degree m is the product
of m regular graphs of degree 1. Smith and Hartley [5] give a practical procedure 
for converting balanced incomplete blocks with b = v into Youden’s squares. 

In this paper I have considered some general classes of designs for two-way 
elimination of heterogeneity. In Section 3 balanced incomplete block designs 
for which b = mv have been used to obtain two-way designs in which each 
treatment occurs in a given position m times. The case m = 1 gives Youden’s 
squares. In Section 4 it has been shown that balanced incomplete block designs 
for which b is not an integral multiple of v can be used to obtain designs for
two-way elimination of heterogeneity in which there are two accuracies (i.e., 
some pairs of treatments are compared with one accuracy, while other pairs are 
compared with a different accuracy) as in the case of lattice designs for one-way 
elimination of heterogeneity. In Sections 5 and 6 partially balanced designs 
have been used to obtain two-way designs with two accuracies. In every case 
the method of analysis and tables of actual designs have been given. 


2. Notation and preliminaries. Consider a two-way design with $k$ rows and $b$ columns. Let there be $v$ treatments altogether, and let $n_{ij}$ denote the number of times the treatment $i$ occurs in row $j$, and $n_{ic}$ the number of times it occurs in the column $c$. If $y_{jc}$ is the yield for the $j$th row and $c$th column, the mathematical model assumed will be

\[
y_{jc} = \mu + t_i + b_j + p_c + \epsilon_{jc}, \tag{2.0}
\]

where $t_i$ is the effect of the treatment $i$ occurring in the row $j$ and column $c$, $b_j$ and $p_c$ are the effects of the $j$th row and $c$th column, respectively, and $\epsilon_{jc}$ is a random variable which is distributed $N(0, \sigma^2)$ independently for each value of $j$ and $c$.

Let $T_i$, $B_j$ and $B_c$ denote respectively the totals of the yields corresponding to the treatment $i$, row $j$ and column $c$. Put

\[
Q_i = T_i - \frac{1}{b} \sum_j n_{ij} B_j - \frac{1}{k} \sum_c n_{ic} B_c + \frac{r_i G}{bk}, \tag{2.1}
\]

where $G$ is the grand total of all the yields and $r_i$ is the number of replications of the $i$th treatment. $Q_i$ is called the adjusted yield of the $i$th treatment.

Let us set

\[
c_{ii} = r_i \left( 1 + \frac{r_i}{bk} \right) - \frac{1}{b} \sum_j n_{ij}^2 - \frac{1}{k} \sum_c n_{ic}^2, \tag{2.2}
\]
\[
c_{iu} = -\frac{1}{b} \sum_j n_{ij} n_{uj} - \frac{1}{k} \sum_c n_{ic} n_{uc} + \frac{r_i r_u}{bk}, \qquad i \neq u. \tag{2.3}
\]

It can be easily shown that the rank of the matrix $(c_{iu})$ is at the most equal to $v - 1$. We shall suppose that the parameters entering in the design are such that rank $(c_{iu})$ is actually equal to $v - 1$. In this case the design is said to be connected. The best unbiased linear estimate of any contrast


\[
l_1 t_1 + l_2 t_2 + \cdots + l_v t_v, \tag{2.4}
\]

is obtained by solving the normal equations

\[
c_{i1} t_1 + c_{i2} t_2 + \cdots + c_{iv} t_v = Q_i, \qquad i = 1, \ldots, v, \tag{2.5}
\]

and substituting the values in the contrast. The $t$'s are determined up to an arbitrary constant, and may be made unique by using the constraint

\[
t_1 + t_2 + \cdots + t_v = 0. \tag{2.6}
\]

Let $\hat{t}_1, \hat{t}_2, \ldots, \hat{t}_v$ be any solution of (2.5). Then the analysis of variance table for the design will be Table I. Detailed proofs of the facts stated in this section can be worked out along the lines indicated by Bose [6].


3. Designs with complete rows in which every treatment occurs in a row m times. Consider a two-way design in which the columns form a balanced incomplete block design with parameters $v, b, r, k, \lambda$, where $v$ is the number of treatments, $b$ is the number of blocks, $k$ is the block size, $r$ is the number of replications of each treatment, and $\lambda$ is the number of times any two treatments occur together in the same column. Then

\[
\sum_c n_{ic}^2 = r, \qquad i = 1, 2, \ldots, v, \tag{3.0}
\]
\[
\sum_c n_{ic} n_{uc} = \lambda, \qquad i, u = 1, 2, \ldots, v; \ i \neq u. \tag{3.1}
\]


TABLE I
Analysis of variance for a two-way design

Source of variation | d.f. | Sum of squares | Mean square
Treatment contrasts eliminating rows and columns | v - 1 | |
Row contrasts ignoring treatments | k - 1 | |
Column contrasts ignoring treatments | b - 1 | |
Error | (b - 1)(k - 1) - (v - 1) | by subtraction |
Total | bk - 1 | |

Consider the matrix $N = (n_{ij})$ of $v$ rows and $k$ columns, $n_{ij}$ being the number already defined. The matrix $N$ is intrinsically associated with the positions of the treatments within the columns of the design, and depends on the parameters of the design only inasmuch as each column adds up to $b$ and each row to $r$. Let $b$ be an integral multiple of $v$, so that $b = mv$. Then $r = mk$. By suitable interchanges of treatments in the same column of the design, the matrix $N$ can be so modified that

\[
n_{ij} = m, \qquad i = 1, 2, \ldots, v; \ j = 1, 2, \ldots, k, \tag{3.2}
\]

since the procedure of Smith and Hartley [5] can be easily generalized in the following manner to cover the case $m \neq 1$.

If $n_{ij} = m$ is not satisfied for all values of $i, j$, we define

\[
m_{ij} = \begin{cases} m - n_{ij} & \text{if } m > n_{ij}, \\ 0 & \text{if } m \leq n_{ij}, \end{cases}
\qquad M = \sum m_{ij}.
\]

Then, following the Smith-Hartley procedure, only slight modifications in the argument show that we can find an interchange or system of interchanges within columns which would reduce $M$ by at least unity. Successive applications of this process give the desired result, since $M = 0$ implies $n_{ij} = m$ for all $i, j$.
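Now we have

\[
\sum_j n_{ij}^2 = k m^2, \qquad i = 1, 2, \ldots, v, \tag{3.3}
\]
\[
\sum_j n_{ij} n_{uj} = k m^2, \qquad i, u = 1, 2, \ldots, v; \ i \neq u. \tag{3.4}
\]

Under the restraint (2.6) the normal equations (2.5) become

\[
\frac{v \lambda}{k}\, t_i = Q_i, \qquad i = 1, 2, \ldots, v, \tag{3.5}
\]

which are exactly the same as for balanced incomplete block designs (cf. Bose [6]). Hence $\hat{t}_i$, the estimate of $t_i$, is $Q_i / rE$, where $E = v\lambda / kr$, and the analysis of variance can be obtained by substituting this value of $\hat{t}_i$ in the table at the end of Section 2. Also

\[
V(\hat{t}_i - \hat{t}_u) = 2\sigma^2 / rE. \tag{3.6}
\]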


When a cyclic or multicyclic solution of a balanced incomplete block design 
is available the matrix N already obeys the condition (3.2), and an actual appli- 
cation of the Smith-Hartley process is unnecessary. Only the designs with 
r < 10 are practically important, and cyclic or multicyclic solutions for all 
but three of these designs are available in the tables of Fisher and Yates [2], 
and in a paper by Rao [7]. The solutions for the three missing cases are given
in Table II. The solution for the design 1 of Table II is obtained by modifying 
the corresponding solution by Bose [8], and the solutions for designs 2 and 3 
are obtained from the corresponding solutions by Bhattacharya [9], [10]. The 
designs considered here may be called extended Youden’s square designs when 
m> 1. 

In Table II, instead of giving the design in the row-column form, it is convenient to give the blocks corresponding to the columns. The row position is then given by the position within the block. This convention will be adopted throughout the paper. In many cases it is possible to represent the designs compactly by developing a set of blocks from one block cyclically. The following convention will be adopted for this purpose. To develop the block $(a, b, \ldots, x)$ cyclically (mod $g$), we write down the set of $g$ blocks $(a + t, b + t, \ldots, x + t)$, $t = 0, 1, 2, \ldots, g - 1$, and then reduce every number appearing in the blocks to lie between 1 and $g$ (both inclusive) by subtracting $g$ whenever a number appearing in the blocks exceeds $g$. In certain cases to each number between 1 and $g$ there correspond $m$ treatments instead of one, the treatments corresponding to $c$ being denoted by $c_1, c_2, \ldots, c_m$. In this case in developing the blocks suffixes are left invariant. Thus by developing $(1_1, 5_2, 4_1)$ cyclically (mod 5), we get $(1_1, 5_2, 4_1)$, $(2_1, 1_2, 5_1)$, $(3_1, 2_2, 1_1)$, $(4_1, 3_2, 2_1)$, $(5_1, 4_2, 3_1)$.
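
The cyclic development convention can be sketched in a few lines. This is an illustrative sketch only: treatments are represented here as `(number, suffix)` pairs, and the function names `develop` and `develop_pairs` are hypothetical. The second function covers compound symbols developed with respect to two moduli, with each coordinate reduced separately and the suffix left unchanged.

```python
def develop(block, g):
    """Develop a block cyclically (mod g).

    block is a sequence of (number, suffix) pairs with 1 <= number <= g.
    Each number is advanced by t = 0, 1, ..., g - 1 and reduced to lie
    between 1 and g; suffixes are left invariant."""
    return [[((c - 1 + t) % g + 1, s) for (c, s) in block] for t in range(g)]


def develop_pairs(block, g1, g2):
    """Develop a block of compound symbols (a, b) with suffix (mod g1, g2).

    block is a sequence of (a, b, suffix) triples; a is reduced mod g1 and
    b mod g2, both to the range 1..g. The first coordinate varies fastest."""
    return [
        [((a - 1 + t1) % g1 + 1, (b - 1 + t2) % g2 + 1, s) for (a, b, s) in block]
        for t2 in range(g2)
        for t1 in range(g1)
    ]
```

For example, `develop([(1, 1), (5, 2), (4, 1)], 5)` reproduces the five blocks listed above.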

Sometimes treatments are represented by compound symbols $(a, b)$ with or without suffixes, and we have to develop a block (mod $g_1, g_2$). This can be done analogously. For example, by developing (mod 3, 3) the block

$[(2, 1)_1, (1, 2)_1, (2, 2)_2, (1, 1)_2]$,

we get the nine blocks

$[(2, 1)_1, (1, 2)_1, (2, 2)_2, (1, 1)_2]$, $[(3, 1)_1, (2, 2)_1, (3, 2)_2, (2, 1)_2]$,
$[(1, 1)_1, (3, 2)_1, (1, 2)_2, (3, 1)_2]$, $[(2, 2)_1, (1, 3)_1, (2, 3)_2, (1, 2)_2]$,
$[(3, 2)_1, (2, 3)_1, (3, 3)_2, (2, 2)_2]$, $[(1, 2)_1, (3, 3)_1, (1, 3)_2, (3, 2)_2]$,
$[(2, 3)_1, (1, 1)_1, (2, 1)_2, (1, 3)_2]$, $[(3, 3)_1, (2, 1)_1, (3, 1)_2, (2, 3)_2]$,
$[(1, 3)_1, (3, 1)_1, (1, 1)_2, (3, 3)_2]$.


TABLE II
Some extended Youden's square designs

Serial no. | Parameters: v, b, r, k, λ | Blocks

1 | 10,30,9,3, | (Se, Lz, 22), (li, 52, 41), (21, 31, 52), (1a, 41, 22), (22, 31, 21), (Se, 22, 5); 
2 other blocks are obtained by developing (mod 5), keeping the 
suffixes fixed. 


2 25, 25,9,9, | (5, 1, 23, 6, 20, 12, 17, 2, 11), (18, 21, 5, 7, 10, 24, 3, 12, 1), 
| 3 (15, 2,9, 10, 1, 21, 25, 17, 16), (23, 22,11,9,3, 18, 1, 16, 14), 

(24, 13, 2, 14, 7, 8, 22,1, 17), (8, 25, 20,3, 6, 1, 13, 18, 15), 
(20, 4, 3, 17,8, 10,7, 23,9), (21, 8, 24, 11, 4,6, 2,9, 18), 
(14, 12, 13, 4, 17, 25, 21, 11,3), (3, 24, 17, 22, 15, 5, 16, 4, 6), 
(25, 5, 18, 20, 16, 14, 4,7, 2), (22, 19, 1, 5, 25, 11, 10, 8, 4), 
(19, 14,6, 13,9, 17, 18, 5, 10), (1, 20, 15, 12, 19, 4, 9, 14, 24), 
(16,7, 4, 1, 13, 23,6, 21,19), (12, 16, 8, 23, 24, 9, 5, 25, 13), 
(9, 3, 25, 19, 22, 2, 12,6,7), (7,17, 12, 18, 11, 16, 15, 19,8), 
(13, 11, 10, 16, 2,3, 19, 24, 20), (6, 10, 16, 8, 12, 20, 14, 22, 21), 
(2, 23, 21, 15, 14, 19, 8, 3, 5), (10, 6, 14, 24, 23, 7, 11, 15, 25), 
(17, 18, 19, 25, 21, 22, 24, 20, 23), (4, 15, 22, 2, 18, 13, 23, 10, 12), 

(11, 9, 7, 21, 5, 15, 20, 13, 22) 


31, 31, 10, | (1,2, 28, 15,9, 11,8, 16, 18, 4), (2,3, 22, 16, 10, 17,9, 19, 5, 12), 
10, 3 (3, 4, 23,6, 17, 13, 10, 18, 11, 20), (4, 5, 24, 18, 12, 21, 11, 14, 19, 7), 
(5, 6, 25, 19, 13, 8, 12, 20, 15, 1), (6, 7, 26, 20, 14, 9, 16, 21, 2, 13), 
(7,1, 27, 21,8, 10, 14, 17,3, 15), (9, 12,6, 1, 27, 18, 29, 26, 17, 24), 
(10, 13, 7, 2, 29, 25, 19, 27, 28,18), (11, 14,1,3, 19, 20, 26, 28, 29, 22), 
(12,8, 2, 4, 20, 29, 23, 22, 21, 27), (13,9, 3,5, 21, 15, 28, 23, 24, 29), 
(14, 10, 4, 24, 15, 16, 22,29,6,25), (8,11, 5, 29, 16, 23, 25, 7, 26, 17), 
(15, 24, 20, 11,2, 27,5, 10, 30,26), (16, 25, 21, 12, 3, 28, 30, 11, 27,6), 
(17, 26, 15, 13, 30, 22,7, 12, 4, 28), (18, 27, 16, 14, 5, 30, 1, 13, 22, 23), 
(19, 28, 17,8, 6, 14, 2, 24, 23, 30), (20, 22, 18,9, 7, 24, 3, 30, 25, 8), 
(21, 23, 19, 30, 1, 26, 4, 15,9, 10), (24, 16, 8, 27, 26,3, 31, 4, 13, 19), 
(25, 17, 9, 28, 31, 4, 27, 5, 20, 14), (26, 18, 10, 22, 28, 31, 21,6, 8, 5), 
(27, 19, 11, 23, 22,6, 15, 9,7, 31), (28, 20, 12, 31, 23, 7, 24, 1, 10, 16), 
(22, 21, 13, 25, 24,1,17,2,31,11), (23,15, 14, 26, 25, 2, 18, 31, 12,3), 
(29, 30, 31,7, 4,5, 6,3, 1,2), (30, 31, 29, 10, 11, 12, 13, 8, 14,9), 
(31, 29, 30, 17, 18, 19, 20, 15, 16, 21) 










4. Other two-way designs obtained from balanced incomplete block designs. 
Balanced incomplete block designs in which the number of blocks (columns) is 
not an integral multiple of the number of treatments can be used to give designs 
with two accuracies for two-way elimination of heterogeneity. This is due to 
the fact that by suitable interchange of treatments in various columns it has 
been possible in every known case where r < 10 to express the design in a form 
such that in the matrix N, already referred to, 

(4.00) dni; Me , poe 2 ** 8 


(4.01) LjNizNus = Me; £,u=1,2,-°- , 038 4, 


where the treatments $i$ and $u$ are e-associates. These associates are similar to the associates defined by Bose and Nair [11]. Thus with respect to any treatment whatsoever, all the rest can be divided into two groups of associates with $n_1$ in the first group and $n_2$ in the second. If two treatments are e-associates, the number of treatments which are f-associates of the first and g-associates of the second is $p^e_{fg}$, independent of the particular pair of treatments started with. The relation of associates is reciprocal. The relations between the parameters can be derived, following Bose and Nair, as

\[
n_1 + n_2 = v - 1, \tag{4.1}
\]
\[
\sum_g p^e_{fg} = n_f \quad \text{when } e \neq f, \tag{4.20}
\]
\[
\sum_g p^e_{fg} = n_f - 1 \quad \text{when } e = f, \tag{4.21}
\]
\[
n_e\, p^e_{fg} = n_f\, p^f_{eg} = n_g\, p^g_{ef}. \tag{4.3}
\]


The normal equations for the estimation of treatment effects are (2.5), with 


(4.40) re (1 = i + ‘) ~ 3 = a, se ee 


(4.41) Cu = 
where the treatments 7 and wu are e-associates. 


Following the method indicated by Bose [6], a solution of the normal equa- 
tions is found to be 


(4.5) al; = Q; — (BiAn + 2 An)Qi() — (8; Aw + 82 Aw)Q2(t), 


where Q,(z) denotes the sum of the Q’s for all the e-associates of the treatment 
2, and (A,,) is the inverse of the matrix (a.,) whose elements are given by 


(4.6) es — bes ‘4 BeNe — By pa — Be pey ? 


where $\delta_{ef} = 1$ or 0, according as $e = f$ or $e \neq f$.





The analysis of variance can be obtained by substituting for $\hat{t}_i$ in Table I. Also

\[
V(\hat{t}_i - \hat{t}_u) = \frac{2\sigma^2}{\alpha} \{ 1 + \beta_1 A_{1e} + \beta_2 A_{2e} \} \tag{4.7}
\]

if the treatments $i$ and $u$ are e-associates.

The designs considered here will be said to belong to the class $Y_1$. The parameters of some useful designs of this class are given in Table IIIa, and the actual designs in the Table IIIb.

The ratio of the variances of the two different kinds of comparisons is given by

\[
R = \frac{1 + \beta_1 A_{11} + \beta_2 A_{21}}{1 + \beta_1 A_{12} + \beta_2 A_{22}}. \tag{4.8}
\]


We shall now give a number of useful designs belonging to the class $Y_1$. One set of designs is obtained from the orthogonal series designs with the parameters

\[
v = s^2, \qquad b = s^2 + s, \qquad r = s + 1, \qquad k = s, \qquad \lambda = 1, \tag{4.90}
\]

the other parameters being


(4.91) n, = s(s — 1), m=s-—l, mm = s+ 2, be = 8 + 3, 


, s(s — 2) 8 1 . s(s — 1) 
(4.92) (p;,) = , (pyo) = ’ 
So rs fo 0 Po. 


; I 

(4.93) =1+ Fry 5* 

These designs are obtained by using the difference sets of Bose [12]. He has shown that if $(d_1, d_2, \ldots, d_s)$ is the difference set corresponding to $s$, where $s$ is a prime or power of a prime, then a solution of the balanced incomplete block design with parameters (4.90) is obtained as follows:

(i) $s^2 - 1$ blocks are obtained by developing the block $(d_1, d_2, \ldots, d_s)$ cyclically (mod $s^2 - 1$);

(ii) $s + 1$ other blocks are obtained from the block $(0, s + 1, 2(s + 1), \ldots, (s - 2)(s + 1), \infty)$ by adding successively the numbers $1, 2, \ldots, s + 1$, where $\infty$ remains invariant under the addition.

To convert this solution into a two-way design of the class $Y_1$, we keep the $s^2 - 1$ blocks (i) unchanged. Also the first two blocks of (ii) are kept unchanged, but in the others $\infty$ is successively moved to the left. Finally replace $\infty$ by $s^2$. For example, the difference set corresponding to $s = 3$ is (1, 6, 7), and hence the blocks of the design corresponding to $s = 3$ are (1, 6, 7), (2, 7, 8), (3, 8, 1), (4, 1, 2), (5, 2, 3), (6, 3, 4), (7, 4, 5), (8, 5, 6); (1, 5, 9), (2, 6, 9), (3, 9, 7); (9, 4, 8).
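
The construction in (i) and (ii), followed by the conversion to a two-way design, can be sketched as follows. This is a minimal sketch under one stated assumption: "successively moved to the left" is read as moving $\infty$ one further place to the left in each block after the second, which reproduces the $s = 3$ example above. The function names `two_way_blocks` and `pair_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def two_way_blocks(diff_set, s):
    """Blocks of the class Y_1 design built from a difference set for s."""
    g = s * s - 1
    inf = s * s  # the symbol infinity is finally replaced by the treatment s^2
    # (i) develop the difference set cyclically (mod s^2 - 1)
    blocks = [tuple((d - 1 + t) % g + 1 for d in diff_set) for t in range(g)]
    # (ii) add 1, 2, ..., s + 1 to (0, s + 1, ..., (s - 2)(s + 1), infinity)
    base = [j * (s + 1) for j in range(s - 1)]
    for add in range(1, s + 2):
        finite = [x + add for x in base]
        shift = max(0, add - 2)  # assumption: first two blocks unchanged, then one more place each time
        pos = len(finite) - shift
        blocks.append(tuple(finite[:pos] + [inf] + finite[pos:]))
    return blocks


def pair_counts(blocks):
    """Number of blocks in which each unordered pair of treatments occurs together."""
    counts = Counter()
    for blk in blocks:
        for pair in combinations(sorted(blk), 2):
            counts[pair] += 1
    return counts
```

For `s = 3` and the difference set `(1, 6, 7)`, every one of the 36 pairs of treatments occurs together in exactly one block, in agreement with $\lambda = 1$ in (4.90).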

The method of identifying the associates is easy. Divide the treatments into $s$ groups: $(1, 2, \ldots, s)$, $(s + 1, s + 2, \ldots, 2s)$, $\ldots$, $(s^2 - s + 1, s^2 - s + 2, \ldots, s^2)$. Any two treatments are 1-associates if they are in different groups and 2-associates if they are in the same group.
Bose's difference sets for s = 2, 3, 4, 5, 7, 8, and 9 are given below.

s    Difference set
2    1, 2
3    1, 6, 7
4    1, 3, 4, 12
5    1, 3, 16, 17, 20
7    1, 2, 5, 11, 31, 36, 38
8    1, 6, 8, 14, 38, 48, 49, 52
9    1, 13, 35, 48, 49, 66, 72, 74, 77


The parameters of some other designs of the class $Y_1$ are given in Table IIIa. The corresponding blocks are given in Table IIIb. In each case the treatments can be divided into a number of groups such that the treatments in different groups are 1-associates, and treatments in the same group are 2-associates. These groups are also shown in Table IIIb.


TABLE IIIa
Some designs of the class $Y_1$: Parameters





v, Tr, 
Bi; 


Reference no. (Pjq) (p},) 





6, 
8, 





5, 
5, 


171/170 


5, 4, 
7. 143/142 







TABLE IIIb
Some designs of the class $Y_1$: Blocks and groups for identifying associates


Reference no. | Blocks Groups 





| Develop the blocks (11, 2:1, 42, 4:), (21, 12, 3:1, | There are two groups. 
42), (12, 21, 22, 32), (mod 5), keeping the suf-| Treatments with the 
fixes fixed same suffixes belong 
| to the same group 


(6, 2, 3), (4, 3, 2), (8, 5, 4), (6, 5, 4), (5,6, 2), | There are three groups: 
| (3, 1, 5), (2, 4, 1), (1, 6, 3), (4, 1, 6), (5, 2, 1) (1, 2), (3, 4), (5, 6) 


Develop the block (3, 5, 6, 7), (mod 7), There are four groups: 
and add the blocks (8, 2, 1, 4), (8, 5, 3, 2), (3, (1, 2), (3, 4), (5, 6), 
8, 4, 6), (4, 8, 7, 5), (5, 6, 8, 1), (, 7, 2, 8), (7, (7, 8) 

| 1, 8, 3) 


| (2,1,3), (4,1, 5), (4, 6, 2), (8, 9, 1), (12, 8, 4), | There are three groups: 
(8, 10, 2), (3, 138,14), (5, 11, 14), (13, 6,11), (14, (1, 2, 3, 4, 5), (6, 7, 8, 
7,9), (7, 11, 12), (7, 18, 10), (6, 7,1), (2, 5, 7), 9, 10), (11, 12, 13, 14, 
(4, 7, 3), (1, 10, 11), (11, 3, 8), (2, 9, 11), (5, 8, 15) 
13), (1, 12, 13), (9, 4, 13), (12, 2, 14), (14, 6, 8), 
(10, 14, 4), (8, 5, 12), (15, 10, 5), (6, 15, 9), (15, | 
8,7), (11,4, 15), (18, 2, 15), (1, 14, 15), (3, 5, 6), 
(9, 3, 10), (5, 9, 12), (10, 12, 6) 


(8, 2,4, 10, 6), (7,8, 10, 2,1), (8, 8,9,4,7), (9, | There are five groups: 
10, 1, 8, 5), (2, 5, 1, 10, 3), (10, 3,4, 1,6), 6,1, ' (1, 2), (3, 4), (5, 6), 
9, 5, 4), (5, 6, 8, 2, 9), (1, 6, 7, 3, 8), (4, 9, 2, 7, (7, 8), (9, 10) 

10), (5, 10, 3, 9, 7), (6, 7, 2,9, 1), (9, 1,3, 4, 2), 

(4, 5, 8, 3, 2), (7, 2, 6, 5, 3), (3, 9, 10, 6, 8), (8, 

7, 5, 1, 4), (10, 4, 7, 6, 5) 


(1, 2, 7, 8, 18, 14), (5, 18, 14, 12, 6, 11), (3, 10, | There are two groups: 
13, 9, 4, 14), (5, 6, 3, 15, 16, 4), (7, 9, 8, 10, 16, (1, 2, 3, 4, 5, 6, 7, 8), 
15), (1, 2, 15, 16, 12, 11), (6,3, 1, 15,8, 13), (7, | (9, 10, 11, 12, 13, 14, 
15, 5, 13, 10, 12), (9, 11, 4, 13, 15, 2), (14, 7, 4, 15, 16) 

16, 2, 5), (8, 14, 6, 9, 11, 16), (12, 3, 1, 10, 14, 

16), (16, 4,6, 1, 13,7), (10, 8, 16, 11, 5, 13), (13, | 

16, 2, 12, 9, 3), (8, 5, 2, 14, 3, 15), (15, 12, 7, 6, 

14, 9), (11, 4, 10, 14, 15, 1), (6, 5, 9, 2, 1, 10), (3, 

1, 5, 7, 11, 9), (4,1, 8, 5, 9, 12), (4, 7, 3, 11, 12, 

8), (2, 8, 12, 4, 10, 6), (2, 6, 11, 3, 7, 10) 


5. Partial and extended partial Youden squares. We have seen how balanced incomplete block designs can be used for obtaining designs for two-way elimination of heterogeneity. In this and the following section we shall consider the use of partially balanced designs [11], [13] for the same purpose. The case when b = v has already been considered by Bose and Kishen [14]. They call these designs partial Youden's squares. In this section we shall consider the case b = mv, r = mk, when $m \ne 1$; we may call these designs extended partial Youden's squares.


TABLE IV 


Cyclic solutions to partially balanced incomplete block designs leading to 
partial and extended partial Youden’s squares 


(Pye ‘ Solution 


Develop (mod 13) the block 
(1, 3, 9). 


Develop (mod 15) the blocks 
(1, 7, 9) and (1, 12, 15 


Develop (mod 15) the block 
(1, 3, 4, 12 


Develop (mod 17) the block 
(1, 9, 13, 15, 16, 8, 4, 2) 


Develop (mod 24) the block 
1, 3, 16, 17, 20). 


Develop (mod 5, 5) the 
blocks (1, 5), (1, 4), (3, 
1)] and [(3, 5), (3,2), (4,3). 


Develop (mod 26) the block 
@,: 1; 2, 8, 11, 18, 2, 2, 
92 


Develop (mod 29) the block 


1, 16, 24, 7, 25, 23, 2 


Develop (mod 48) the block 
., ae ty oa, OB. 


Develop (mod 63) the block 
1, 6, 8, 14, 38, 48, 49, 52 


0 Develop (mod 80) the block 
6 (1, 18, 35, 48, 49, 66, 72, 
74, 77 


Suppose there exists a partially balanced design with l different kinds of associates, and parameters $v, b, r, k$; $\lambda_1, \lambda_2, \ldots, \lambda_l$; $n_1, n_2, \ldots, n_l$; $p^{e}_{fg}$ $(e, f, g = 1, 2, \ldots, l)$. When b = mv, r = mk, then Smith and Hartley's process can be used just as in Section 3 to so modify the design that each treatment occurs just m times in each row (the columns constituting the blocks). In this case we have


e SNES AES. So RPE et 1). 
(5.0) Ci = r(1 T = h i => r(1 i: = d, 


ss km Ne r 
(5.1) a ee + bk 
so that the normal equations take exactly the same form as for partially balanced incomplete block designs. Hence a solution of the normal equations is given by equations (4.5) and (4.6) of Section 4, with a and $\beta_e$ now given by (5.0) and (5.1). The equation (4.7) is also valid. In case there are only two kinds of associates, the ratio of the variances of the two kinds of comparisons is given by (4.8).

When a cyclic solution to a partially balanced incomplete block design with b = mv, r = mk is available, then it can be directly used as a two-way design without further modification. A number of cyclic solutions have been given by Bose and Nair [11]. Cyclic solutions to a number of new designs are given in Table IV. In each case l = 2.


6. Other two-way designs obtained from partially balanced incomplete block designs. Under certain conditions it is possible to use a partially balanced design with two types of associates to give a two-way design with two types of accuracies even when b is not an integral multiple of v. The necessary condition is


' b 
(6.0) a =m+1 or mt+il. 
: r 


In this case it has been found that in all cases of practical interest we can, by suitable interchanges within columns, arrange that

(6.1)  $\sum_j n_{ij}^2 = d,$
(6.2)  $\sum_j n_{ij} n_{uj} = \mu_e, \qquad i, u = 1, 2, \ldots, v;\ i \ne u;\ e = 1, 2,$

where two treatments which are e-associates for the columns are also e-associates for the rows.
In this case 


ag = r\ _d 
(6.3) Ci = r(1 os x) b 


_ He _ Xe r 


(6.4) Cu = h k + bk ~ 


Be, 1u=1,2,-:-,v3tF u. 

The analysis is the same as in Section 4, the equations (4.5), (4.6), (4.7), (4.8) remaining valid but a and $\beta_e$ now given by (6.3) and (6.4). The analysis of variance is obtained by substituting for $t_i$ in Table I.

The designs considered here may be said to belong to class $Y_2$. Some designs of this class are given below in Tables Va and Vb. The parameters are given in Table Va whereas the actual solutions appear in Table Vb. In this case the representation is such that two treatments in the same group are 1-associates, whereas two treatments in different groups are 2-associates. These groups are also shown in Table Vb.


TABLE Va 
Some designs of the class Y2: Parameters 


Reference ; n1, nz 1) 
a. s d (Py, 





0 O 
0 10 








TABLE Vb 
Some designs of the class Y2: Blocks and groups for identifying associates 





Reference 


Blocks Groups 
no. 


Blocks: (8, 11, 5, 7, 1, 2), (2, 3, 1, 8, 7, 9), (3, 4, 10, 9, 2, 8), (4, 10, 11, 5, 9, 3), (5, 1, 4, 11, 10, 7), (6, 5, 8, 2, 12, 11), (9, 12, 7, 6, 3, 1), (10, 6, 2, 12, 8, 4), (11, 9, 12, 3, 6, 5), (12, 7, 6, 1, 4, 10)
Groups: There are six groups: (1, 7), (2, 8), (3, 9), (4, 10), (5, 11), (6, 12).


Blocks: (10, 6, 4), (3, 7, 5), (11, 2, 13), (1, 9, 8), (14, 12, 15), (11, 12, 5), (3, 6, 8), (14, 7, 4), (10, 9, 13), (1, 2, 15), (1, 7, 13), (8, 2, 4), (10, 12, 8), (14, 9, 5), (11, 6, 15), (15, 3, 9), (4, 1, 12), (5, 10, 2), (8, 11, 7), (13, 14, 6), (2, 8, 14), (6, 5, 1), (9, 4, 11), (7, 5, 10), (12, 13, 3)
Groups: There are three groups: (1, 3, 10, 11, 14), (2, 6, 7, 9, 12), (4, 5, 8, 13, 15).


My sincerest thanks are due to Professor R. C. Bose, under whose guidance 
this research was carried out. 


REFERENCES 


[1] W. J. Youden, "Use of incomplete block replications in estimating tobacco mosaic virus," Contributions from Boyce Thompson Institute, Vol. 9 (1937), pp. 317-326.
[2] R. A. Fisher and F. Yates, Statistical Tables, 3rd ed., Hafner Publishing Co., New York, 1948.
[3] F. W. Levi, Finite Geometrical Systems, University of Calcutta, 1942.
[4] D. König, Theorie der endlichen und unendlichen Graphen, 1936.
[5] C. A. B. Smith and H. O. Hartley, "The construction of Youden squares," Jour. Roy. Stat. Soc., Ser. B, Vol. 10 (1948), pp. 262-263.
[6] R. C. Bose, "Least square aspects of analysis of variance," mimeographed notes, Institute of Statistics, University of North Carolina.
[7] C. R. Rao, "Difference sets and combinatorial arrangements derivable from finite geometries," Proc. Nat. Inst. Sci. India, Vol. 12 (1946), pp. 123-135.
[8] R. C. Bose, "On the construction of balanced incomplete block designs," Annals of Eugenics, Vol. 9 (1939), pp. 353-399.
[9] K. N. Bhattacharya, "On a new symmetrical balanced incomplete block design," Bull. Calcutta Math. Soc., Vol. 36 (1944), pp. 91-96.
[10] K. N. Bhattacharya, "A new solution in balanced incomplete block designs," Sankhyā, Vol. 7 (1946), pp. 423-424.
[11] R. C. Bose and K. R. Nair, "Partially balanced incomplete block designs," Sankhyā, Vol. 4 (1939), pp. 337-373.
[12] R. C. Bose, "An affine analogue of Singer's theorem," Jour. Indian Math. Soc. (new series), Vol. 6 (1942), pp. 1-16.
[13] R. C. Bose, "Recent work on incomplete block design in India," Biometrics, Vol. 3 (1947), pp. 176-178.
[14] R. C. Bose and K. Kishen, "On partially balanced Youden's squares," Science and Culture, Vol. 5 (1939), pp. 136-137.





GENERALIZED HIT PROBABILITIES WITH A GAUSSIAN
TARGET¹


By D. A. S. Fraser
University of Toronto


1. Summary. A general discrete distribution is obtained whose random variable is the number of "hits" on a target. The target is k-dimensional and Gaussian diffuse, that is, the probability of a hit is given to within a constant factor by a Gaussian probability density function of the position of the "trajectory" in k dimensions. For a series of n rounds, the n positions of the trajectory have a multivariate Gaussian distribution. An expression is given, using Theorems 1 and 2 or 1 and 3, for the probability of r hits as a linear combination of probabilities of all hits on each possible set of rounds. Theorems 4, 5, and 6, with Theorem 1, give three limiting distributions as n, the number of rounds, tends to infinity. Theorems 7, 8, and 9, with Theorem 1, present three other limiting cases, and Theorems 10 and 1 give a time average result.


2. The problem. In [1], L. B. C. Cunningham and W. R. B. Hynd proposed
a problem in multivariate statistics: to find the probability of at least one hit 
when an automatic gun is used against a moving target. Because of inability in 
aiming, the point of aim, by which we mean the centre of the distribution of the 
shell trajectory, will not always be the centre of the target. In fact, while the 
gun is being fired, the point of aim is found to wander back and forth across the 
target. The main complication in the problem arises in taking account of the 
dependence between the successive points of aim at the instants of firing. 

In [1] the problem is given an approximate solution covering a partial range 
of parameter values and assuming the target has a circular outline. 

Here the problem is modified by using a Gaussian diffuse target, a target for 
which the probability of a hit is given to within a constant factor by a Gaussian 
probability density function of the position of the trajectory. From a target 
which is essentially two-dimensional for aiming, the problem is generalized to a 
target in k dimensions, having in mind the possibility of application to other 
problems. 

If we assume the target to be a Gaussian diffuse target and the position of 
the trajectory to be distributed according to a two-dimensional Gaussian dis- 
tribution about the point of aim, then the probability of a “hit” as a function 
of the point of aim also has the form of a Gaussian diffuse target; that is, it is 
a constant times a Gaussian pdf of the point of aim. This will be discussed in 
a later paper, where the general theory will be applied to the two-dimensional 
problem as proposed by Cunningham and Hynd and a method of numerical 
evaluation considered and applied to an example. 
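The closure property stated in this paragraph can be checked numerically in one dimension. The sketch below is illustrative only: it assumes a target of the form $C \exp(-\tfrac12 \tau x^2)$ and a trajectory position distributed as $N(a, \sigma^2)$ about the point of aim a, in which case the hit probability as a function of a should be $C(1 + \tau\sigma^2)^{-1/2} \exp[-\tfrac12 \tau a^2/(1 + \tau\sigma^2)]$, again a constant times a Gaussian in a. The function names and the midpoint-rule integration are assumptions made for this example, not part of the paper.

```python
import math

def hit_prob_closed_form(a, C, tau, sigma2):
    """C (1 + tau*sigma2)^(-1/2) exp(-tau a^2 / (2 (1 + tau*sigma2)))."""
    denom = 1.0 + tau * sigma2
    return C / math.sqrt(denom) * math.exp(-0.5 * tau * a * a / denom)

def hit_prob_numeric(a, C, tau, sigma2, half_width=12.0, steps=20000):
    """Midpoint-rule value of E[C exp(-tau X^2 / 2)] with X ~ N(a, sigma2)."""
    sd = math.sqrt(sigma2)
    lo = a - half_width * sd
    dx = 2.0 * half_width * sd / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        # Gaussian density of the trajectory position about the point of aim
        pdf = math.exp(-0.5 * (x - a) ** 2 / sigma2) / math.sqrt(2 * math.pi * sigma2)
        total += C * math.exp(-0.5 * tau * x * x) * pdf * dx
    return total
```

Comparing the two functions for a few values of a gives agreement to within the integration error, which is the one-dimensional instance of the statement that the smoothed target is again Gaussian diffuse.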

¹ Prepared in connection with research sponsored by the Office of Naval Research.


For the general theory we shall start with the Gaussian diffuse target in terms 
of the point of aim, consider it in k dimensions, and call it a “‘success function.” 
The point of aim is a random variable in k dimensions and will be called a pre- 
diction. If the prediction yields a hit we shall speak of a successful prediction. 

The abstracted statistical problem may be stated as follows. In a series of n predictions having a joint distribution, find the probability distribution of the number R of successful predictions. Let the ith prediction be $X_i = (X_{1i}, X_{2i}, \ldots, X_{ki}) = \{X_{\mu i}\}$ where $\mu$ ranges over the set $(1, 2, \ldots, k)$. A prediction $X_i = x_i$ becomes a successful prediction with probability given by the success function $s_i(x_i)$, that is,

$\Pr\{\text{Successful prediction} \mid X_i = x_i\} = s_i(x_i),$

where $0 \le s_i(x_i) \le 1$.

In the following theory the problem is solved when the predictions have a 
Gaussian distribution and the success functions have the form of a Gaussian 
diffuse target. 

In the original problem of Cunningham and Hynd it was found that the horizontal and vertical components of the point of aim were reasonably independent. Consequently we shall assume independence between the values of the $\mu$th coordinate for the n predictions and values of the $\nu$th coordinate, where $\mu \ne \nu = 1, 2, \ldots, k$. The generalization omitting this independence provides little additional complication.

For each value of $\mu$, let the set $\{X_{\mu i}\}$ have a Gaussian distribution with means $\{m_{\mu i}\}$ and covariance matrix $\|\sigma^{(\mu)}_{ij}\|$ which is positive definite. Because we shall want the probability density function for any subset of the predictions, we introduce the following notation for the inverse of the covariance matrix corresponding to a subset. For a typical subset $(i_1, i_2, \ldots, i_r)$ of the integers $(1, 2, \ldots, n)$ we shall use the symbol $\beta_r$. Then if p, q range over this subset, we have

(2.1)  $\|\sigma^{(\mu)}_{pq}\|^{-1} = \|\sigma^{pq}_{\beta_r(\mu)}\|$

as the inverse of the covariance matrix for the $\mu$th coordinates of the subset $\beta_r$ of the predictions. Therefore they will have probability density element

(2.2)  $\dfrac{|\sigma^{pq}_{\beta_r(\mu)}|^{1/2}}{(2\pi)^{r/2}} \exp\Big[-\tfrac12 \sum_{p,q \in \beta_r} \sigma^{pq}_{\beta_r(\mu)} (x_{\mu p} - m_{\mu p})(x_{\mu q} - m_{\mu q})\Big] \prod_{p \in \beta_r} dx_{\mu p}.$

Let the success function of the ith prediction have the following form:

(2.3)  $s_i(x_i) = C_i \exp\Big[-\tfrac12 \sum_{\mu,\nu} \tau_i^{\mu\nu} x_{\mu i} x_{\nu i}\Big],$

where $0 < C_i \le 1$, $\|\tau_i^{\mu\nu}\|$ is positive definite, and $\mu, \nu$ range over the set $\{1, \ldots, k\}$. There is no essential restriction in assuming that the success function is centered at the origin, since a change of origin in each k-dimensional space to center the success functions would only adjust the values $m_{\mu i}$.


3. Probabilities from expectations. To describe the distribution of R we need the probabilities of 0, 1, 2, ..., n successful predictions, that is, $\Pr\{R = r\}$ for r = 0, 1, ..., n. These will not be given in the main theorems, but rather an expression for $E_r$, defined below, from which the probabilities can be calculated by well known formulas which are given in Theorem 1.

(3.1)  $E_r = \sum_{i_1 < \cdots < i_r} E_{i_1 i_2 \cdots i_r} = \sum_{\beta_r} E_{\beta_r},$

where the summation is over all sets of r integers chosen from the set $(1, 2, \ldots, n)$. $E_{i_1 i_2 \cdots i_r}$ is the probability that predictions $i_1, i_2, \ldots, i_r$ will be successful predictions. $E_r$ can be interpreted as the expected number of sets of r successful predictions, counted with overlapping, in our series of n predictions. This is easily seen since $E_r$ is the sum of the probabilities for all the possible sets of r predictions.

THEOREM 1. If $E_r$ is defined by equation (3.1) and following, then the probability of 0 successes is

(3.2)  $\Pr\{R = 0\} = 1 - E_1 + E_2 - \cdots + (-1)^n E_n,$

and the probability of r successes is

(3.3)  $\Pr\{R = r\} = E_r - \dfrac{(r+1)!}{r!\,1!} E_{r+1} + \cdots + (-1)^s \dfrac{(r+s)!}{r!\,s!} E_{r+s} + \cdots + (-1)^{n-r} \dfrac{n!}{r!\,(n-r)!} E_n$

$\qquad\qquad = E_r - \dbinom{r+1}{r} E_{r+1} + \cdots + (-1)^s \dbinom{r+s}{r} E_{r+s} + \cdots + (-1)^{n-r} \dbinom{n}{r} E_n.$

These are well known formulas of probability theory.
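As a concrete check of Theorem 1, the following sketch evaluates (3.3) in the simplest setting, independent trials with success probabilities $p_i$, where $E_r$ reduces to the rth elementary symmetric sum of the $p_i$. Independence is an assumption made only for this illustration (the paper's predictions are dependent), and the function names `E_values` and `prob_hits` are hypothetical.

```python
from itertools import combinations
from math import comb, prod

def E_values(p):
    """E_0, ..., E_n of (3.1) for independent trials with success probabilities p."""
    n = len(p)
    return [sum(prod(p[i] for i in idx) for idx in combinations(range(n), r))
            for r in range(n + 1)]

def prob_hits(E, r):
    """Pr{R = r} from (3.3): sum over s of (-1)^s C(r+s, r) E_{r+s}."""
    n = len(E) - 1
    return sum((-1) ** s * comb(r + s, r) * E[r + s] for s in range(n - r + 1))
```

For p = (0.2, 0.5, 0.9) one has $E_1 = 1.6$, $E_2 = 0.73$, $E_3 = 0.09$, and (3.2) gives $1 - 1.6 + 0.73 - 0.09 = 0.04$, which agrees with the direct product $0.8 \times 0.5 \times 0.1$.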


4. The main theorem. 


THEOREM 2. Given that the success functions are Gaussian diffuse as given by (2.3), and that the n predictions have a Gaussian joint distribution as given by (2.1) and (2.2), then

(4.1)  $E_{\beta_r} = \Big(\prod_p C_p\Big)\, \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{pq} \tau_q^{\mu\nu}\big|^{-1/2}$

when all the $m_{\mu i} = 0$, and a more general formula is given by (4.4) and (4.5) below.





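PROOF. Consider the following expression for $E_{\beta_r}$:

$E_{\beta_r} = \Pr\{\text{predictions } i_1, i_2, \ldots, i_r \text{ will be successful predictions}\}$

$\quad = E\Big\{\prod_p \Big[C_p \exp\Big(-\tfrac12 \sum_{\mu,\nu} \tau_p^{\mu\nu} x_{\mu p} x_{\nu p}\Big)\Big]\Big\}$

$\quad = \Big(\prod_p C_p\Big) \dfrac{\prod_\mu |\sigma^{pq}_{\beta_r(\mu)}|^{1/2}}{(2\pi)^{kr/2}} \int \exp\Big[-\tfrac12 \sum_{\mu,\nu} \sum_{p,q} \big(\tau_p^{\mu\nu}\delta_{pq}\, x_{\mu p} x_{\nu q} + \sigma^{pq}_{\beta_r(\mu)} \delta_{\mu\nu} (x_{\mu p} - m_{\mu p})(x_{\nu q} - m_{\nu q})\big)\Big] \prod_{\mu,p} dx_{\mu p}$

$\quad = \Big(\prod_p C_p\Big) \dfrac{|A|^{1/2}}{(2\pi)^{kr/2}} \int \exp\big[-\tfrac12\{y'Ty + (y - m)'A(y - m)\}\big]\, dy.$

The matrices in the last expression are defined by

(4.2)  $A = \|\sigma^{pq}_{\beta_r(\mu)} \delta_{\mu\nu}\|,$

(4.3)  $T = \|\tau_p^{\mu\nu} \delta_{pq}\|,$

$\qquad y = \|x_{\mu p}\|, \qquad m = \|m_{\mu p}\|.$

The matrices are kr by kr or kr by 1, with $\mu$, p indicating rows and $\nu$, q indicating columns. Then

$E_{\beta_r} = \Big(\prod_p C_p\Big) \dfrac{|A|^{1/2}}{(2\pi)^{kr/2}} \int \exp\big[-\tfrac12\{y'(T + A)y - y'Am - m'Ay + m'A(T + A)^{-1}Am\}\big]\, dy \cdot \exp\big[-\tfrac12\{m'Am - m'A(T + A)^{-1}Am\}\big].$

We have completed the quadratic form by removing an appropriate factor from under the sign of integration. Integrating over the whole space, we find

$E_{\beta_r} = \Big(\prod_p C_p\Big) \dfrac{|A|^{1/2}}{|A + T|^{1/2}} \exp\big[-\tfrac12 (m'Am - m'A(A + T)^{-1}Am)\big]$

(4.4)  $\quad = \Big(\prod_p C_p\Big) |I + A^{-1}T|^{-1/2} \exp\big[-\tfrac12 m'Bm\big]$

$\quad = \Big(\prod_p C_p\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{pq}\tau_q^{\mu\nu}\big|^{-1/2} \exp\Big[-\tfrac12 \sum_{\mu,p,\nu,q} m_{\mu p} B_{\mu,p,\nu,q}\, m_{\nu q}\Big],$

where

(4.5)  $B = \|B_{\mu,p,\nu,q}\| = A - A(A + T)^{-1}A = A[I - (I + A^{-1}T)^{-1}]$

$\qquad = \|\sigma^{pq}_{\beta_r(\mu)}\delta_{\mu\nu}\| \,\big(\|\delta_{pq}\delta_{\mu\nu}\| - \|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{pq}\tau_q^{\mu\nu}\|^{-1}\big).$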
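A small numerical sketch of the zero-mean case of Theorem 2 follows. It forms the kr by kr matrix $I + A^{-1}T$, whose $((\mu, p), (\nu, q))$ element is taken here as $\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{pq}\tau_q^{\mu\nu}$, and returns $(\prod_p C_p)\,|I + A^{-1}T|^{-1/2}$. The data layout (`sigma[mu][p][q]`, `tau[p][mu][nu]`), the function names, and the plain Gaussian-elimination determinant are assumptions made for this example.

```python
import math

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n = len(A)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[piv][c] == 0.0:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d  # a row swap changes the sign of the determinant
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= f * A[c][j]
    return d

def E_beta(C, sigma, tau):
    """(prod C_p) |I + Sigma T|^(-1/2) for zero means (sketch).

    C[p]            constants of the success functions in the subset
    sigma[mu][p][q] covariance of the mu-th coordinates of predictions p, q
    tau[p][mu][nu]  success-function matrix of prediction p
    """
    r, k = len(C), len(sigma)
    idx = [(mu, p) for mu in range(k) for p in range(r)]
    M = [[(1.0 if a == b else 0.0) + sigma[a[0]][a[1]][b[1]] * tau[b[1]][a[0]][b[0]]
          for b in idx] for a in idx]
    return math.prod(C) / math.sqrt(det(M))
```

With k = 1 and r = 1 this reduces to $C(1 + \sigma\tau)^{-1/2}$, and when every `tau[p]` is diagonal the determinant factors into one r by r block per coordinate.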







5. Simplifications. There are two important cases given by Theorem 3 and its corollary in which we obtain a simplification in the formula for $E_{\beta_r}$.

THEOREM 3. Given the conditions of Theorem 2 and the condition that $\|\tau_i^{\mu\nu}\|$ is diagonal for each i, the following expression is obtained for $E_{\beta_r}$:

(5.1)  $E_{\beta_r} = \Big(\prod_p C_p\Big) \prod_\mu \big|\delta_{pq} + \sigma^{(\mu)}_{pq}\tau_q^{(\mu)}\big|^{-1/2}$

when $m_{\mu i} = 0$.

PROOF. We note that $\|\tau_i^{\mu\nu}\| = \|\tau_i^{(\mu)}\delta_{\mu\nu}\|$ and hence the determinant in (4.1) consists of diagonal blocks with zeros elsewhere. When we expand, (5.1) is obtained.

COROLLARY. Assuming the conditions of Theorem 2 and the condition that $\|\tau_i^{\mu\nu}\|$ has for each i the same principal axes and $\|\sigma^{(\mu)}_{ij}\| = \|\sigma_{ij}\|$ independent of $\mu$, then

$E_{\beta_r} = \Big(\prod_p C_p\Big) \prod_\mu \big|\delta_{pq} + \sigma_{pq}\lambda_q^{(\mu)}\big|^{-1/2}$

when $m_{\mu i} = 0$. $\{\lambda_i^{(\mu)}\}$ are the characteristic roots of the matrix $\|\tau_i^{\mu\nu}\|$ and the superscripts $\mu$ yield corresponding roots in the k-dimensional spaces for the n values of i.

The proof is obtained by rotating each k-dimensional space to diagonalize the matrices. The same rotation will diagonalize for each i. Because $\|\sigma_{ij}\|$ is independent of $\mu$ the covariance matrix for the predictions will be unchanged.


6. Limiting distribution (number of predictions $n \to \infty$). Because the expression for $E_r$ is a sum of $\binom{n}{r}$ terms, the numerical calculations for large values of n would be prodigious. Consequently we introduce several limiting distributions obtained by letting n increase indefinitely, subject to suitable conditions. The limiting conditions in each case should indicate the applicability in particular situations.

Concerning the question of the existence of the different limiting distributions, a sufficient condition would be the convergence of the series for $\Pr\{R = r\}$,

$\Pr\{R = r\} = \sum_{s=0}^{\infty} (-1)^s \dbinom{r+s}{r} E_{r+s},$

obtained by having

(6.1)  $\lim_{r \to \infty} \dfrac{r!\,E_r}{m^r} < 1$

for some value of m.

When n becomes large, the enumeration of predictions is unwieldy. Therefore, they will be given in terms of time, a convenient parameter for intuitive consideration. Thus we write

$\sigma^{(\mu)}_{ij} = \sigma^{(\mu)}_{(t_i, t_j)},$

where this is the covariance between the $\mu$th coordinates of the predictions at times $t_i$ and $t_j$. Also

$\tau_i^{\mu\nu} = \tau^{\mu\nu}_{(t_i)},$

an element of the success function matrix at time $t_i$.
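THEOREM 4. Type I. Assuming the conditions of Theorem 2 and letting $n \to \infty$ so that the predictions are uniformly spaced from 0 to T and the success functions approach 0 as 1/n with $D(t) = nC(t)$ bounded and independent of n, then

(6.2)  $E_r = \dfrac{1}{r!\,T^r} \int_0^T \cdots \int_0^T \Big(\prod_p D(t_p)\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\tau^{\mu\nu}_{(t_q)}\big|^{-1/2}\, dt_1\, dt_2 \cdots dt_r,$

where $m_{\mu i} = 0$.

PROOF. The minimum value of $|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\tau^{\mu\nu}_{(t_q)}|$, which is the determinant of a positive definite matrix with 1's added down the diagonal, will be greater than 1. If in addition D(t) is bounded, we have

$\lim_{r \to \infty} \dfrac{r!\,E_r}{[\sup D(t)]^r} \le 1,$

and this is sufficient to assure the existence of the limiting distribution.

$E_r = \lim_{n \to \infty} \sum_{\beta_r} \Big(\prod_p \dfrac{D(t_p)}{n}\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\tau^{\mu\nu}_{(t_q)}\big|^{-1/2}$

$\quad = \lim_{n \to \infty} \dfrac{(1 - 1/n) \cdots (1 - (r-1)/n)}{r!} \cdot \dfrac{\sum \big(\prod_p D(t_p)\big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\tau^{\mu\nu}_{(t_q)}\big|^{-1/2}}{n(n-1)\cdots(n-r+1)}$ (the sum being over all permutations of r from n)

$\quad = \dfrac{1}{r!\,T^r} \int_0^T \cdots \int_0^T \Big(\prod_{p=1}^r D(t_p)\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\tau^{\mu\nu}_{(t_q)}\big|^{-1/2} \prod_{p=1}^r dt_p.$

This completes the proof.

The applicability of this distribution as an approximation for large values of n will be discussed for the Cunningham and Hynd problem in a later paper.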
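THEOREM 5. Type II. Assuming the conditions of Theorem 2 and letting $n \to \infty$ and the scale of the success functions decrease such that $\tau^{\mu\nu}_{(t_i)} = n^{2/k}\, T^{\mu\nu}_{(t_i)}$, then

(6.3)  $E_r = \dfrac{1}{r!\,T^r} \int_0^T \cdots \int_0^T \Big(\prod_{p=1}^r C(t_p)\, |T^{\mu\nu}_{(t_p)}|^{-1/2}\Big) \prod_\mu |\sigma^{(\mu)}_{(t_p,t_q)}|^{-1/2} \prod_{p=1}^r dt_p$

if $m_{\mu i} = 0$, $|T^{\mu\nu}_{(t)}|$ is bounded from 0, and

$\int_0^T \cdots \int_0^T \prod_\mu |\sigma^{(\mu)}_{(t_p,t_q)}|^{-1/2} \prod_{p=1}^r dt_p$

exists and is $< m^r$ where m is independent of r.

PROOF. The existence of the limiting distribution is guaranteed by these last conditions, which insure that the set $\{E_r\}$ satisfies the condition given by formula (6.1).

$E_r = \lim_{n \to \infty} \sum_{\beta_r} \Big(\prod_p C(t_p)\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t_p,t_q)}\, n^{2/k}\, T^{\mu\nu}_{(t_q)}\big|^{-1/2}$

$\quad = \lim_{n \to \infty} \sum_{\beta_r} \Big(\prod_p C(t_p)\Big) \prod_\mu |\sigma^{(\mu)}_{(t_p,t_q)}|^{-1/2}\, \big|\sigma^{pq}_{\beta_r(\mu)}\delta_{\mu\nu} + T^{\mu\nu}_{(t_p)}\, n^{2/k}\, \delta_{pq}\big|^{-1/2}$

$\quad = \lim_{n \to \infty} \sum_{\beta_r} \dfrac{\big(\prod_p C(t_p)\big) \big(\prod_\mu |\sigma^{(\mu)}_{(t_p,t_q)}|^{-1/2}\big) \prod_p |T^{\mu\nu}_{(t_p)}|^{-1/2}\, \big(1 + O(n^{-2/k})\big)}{n^r}$

$\quad = \dfrac{1}{r!\,T^r} \int_0^T \cdots \int_0^T \prod_{p=1}^r \big(C(t_p)\, |T^{\mu\nu}_{(t_p)}|^{-1/2}\big) \prod_\mu |\sigma^{(\mu)}_{(t_p,t_q)}|^{-1/2} \prod_{p=1}^r dt_p.$

This proves the theorem.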
THEOREM 6. Type III. Assuming the conditions of Theorem 2 and letting $n \to \infty$ with the scale of the success function increasing, and its density at any trial decreasing according to

(a)  $\tau^{\mu\nu}_{(t)} = T^{\mu\nu}_{(t)}\, n^{-\alpha},$

(b)  $C(t) = \dfrac{1}{n} D(t),$

then

(6.4)  $E_r = \dfrac{1}{r!} \Big[\dfrac{1}{T} \int_0^T D(t)\, dt\Big]^r,$

where $m_{\mu i} = 0$.

The proof of this theorem is similar to that of Theorem 4. Note that the condition for a limiting distribution is satisfied so long as D(t) is bounded.

This distribution is the Poisson distribution with the usual Poisson parameter

$m = \dfrac{1}{T} \int_0^T D(t)\, dt.$
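That $E_r = m^r/r!$ leads to the Poisson distribution can be seen by substituting it into the series of Theorem 1, since $\sum_s (-1)^s \binom{r+s}{r} m^{r+s}/(r+s)! = (m^r/r!)\, e^{-m}$. The sketch below checks this numerically; the function names and the truncation of the series after a fixed number of terms are assumptions made for the example.

```python
from math import comb, exp, factorial

def prob_from_E(E_of, r, terms=60):
    """Pr{R = r} = sum_s (-1)^s C(r+s, r) E_{r+s}, truncated after `terms` terms."""
    return sum((-1) ** s * comb(r + s, r) * E_of(r + s) for s in range(terms))

def poisson_E(m):
    """E_r = m^r / r!, the Type III limiting form, returned as a function of r."""
    return lambda r: m ** r / factorial(r)
```

For example, `prob_from_E(poisson_E(1.5), 2)` agrees with `exp(-1.5) * 1.5**2 / 2` to within rounding error.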


7. Limiting distributions for a fixed number of predictions n.

THEOREM 7. When the scale of the success function increases, the distribution approaches the simple generalization of the binomial where the probability of successful predictions need not be the same for each trial, and

(7.1)  $E_r = \sum_{\beta_r} \prod_{p \in \beta_r} C_p,$

where $m_{\mu i} = 0$.

THEOREM 8. When the correlation between the values of particular coordinates of $X_i$ approaches 0, then the simple binomial generalization is obtained, with

(7.2)  $E_r = \sum_{\beta_r} \prod_{p \in \beta_r} \big\{C_p\, |\delta_{\mu\nu} + \sigma^{(\mu)}_{pp}\tau_p^{\mu\nu}|^{-1/2}\big\}.$

THEOREM 9. When the correlation between the values at different trials of particular coordinates of $X_i$ approaches 1, then

(7.3)  $E_r = \sum_{\beta_r} \Big(\prod_p C_p\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sqrt{\sigma^{(\mu)}_{pp}}\sqrt{\sigma^{(\mu)}_{qq}}\, \tau_q^{\mu\nu}\big|^{-1/2}.$

The proofs for Theorems 7, 8, and 9 follow routine lines.


8. Time average for a fixed number of predictions n. If the conditions for our generalized distribution vary with time, then an expression for the probabilities obtained as a time average would be appropriate.

THEOREM 10. Assuming the time interval between predictions is h and that the first prediction occurs at an undetermined time in the interval (0, T), then the general distribution has its probabilities determined by

(8.1)  $E_r = \sum_{\beta_r} \dfrac{1}{T} \int_0^T \Big(\prod_p C(t + ph)\Big) \big|\delta_{pq}\delta_{\mu\nu} + \sigma^{(\mu)}_{(t+ph,\,t+qh)}\tau^{\mu\nu}_{(t+qh)}\big|^{-1/2}\, dt.$

PROOF. Assuming that the time of the first prediction is uniformly distributed on the interval (0, T), (8.1) follows from Theorem 2.


9. Acknowledgement. The author wishes to express his appreciation to Pro- 
fessor S. S. Wilks, under whose guidance this paper was prepared, and to thank 
Professor John W. Tukey for valuable discussions of the problem. 


REFERENCE 


[1] L. B. C. Cunningham and W. R. B. Hynd, "Random processes in air warfare," Jour. Roy. Stat. Soc., Suppl., Vol. 8 (1946), pp. 62-85.





ESTIMATION OF PARAMETERS IN TRUNCATED PEARSON 
FREQUENCY DISTRIBUTIONS 


By A. C. Cohen, Jr.
The University of Georgia 


1. Introduction and summary. A method based on higher moments is pre- 
sented in this paper by which the type of a univariate Pearson frequency dis- 
tribution (population) can be determined and its parameters estimated from 
truncated samples with known points of truncation and an unknown number of 
missing observations. Estimating equations applicable to the four-parameter 
distributions involve the first six moments of a doubly truncated sample or the 
first five moments of a singly truncated sample. When the number of parameters 
to be estimated is reduced, there is a corresponding reduction in the order of the 
sample moments required. A sample is described as singly or doubly truncated 
according to whether one or both “tails” are missing. Estimates obtained by 
the method of this paper enjoy the property of being consistent and they are 
relatively simple to calculate in practice. They should be satisfactory for (a) 
rough estimation, (b) graduation, and (c) first approximations on which to base 
iterations to maximum likelihood estimates. 

Previous investigations of truncated univariate distributions include studies 
of truncated normal distributions by Pearson and Lee [1], [2], Fisher [3], Stevens 
[4], Cochran [5], Ipsen [6], Hald [7], and this writer [8], [9]. In addition, the 
truncated binomial distribution has been studied by Finney [10], and the trun- 
cated Type III distribution by this writer [11]. 


2. Complete distributions. The Pearson system of frequency curves has its genesis in the differential equation

(1)  $\dfrac{1}{f(x)} \dfrac{df(x)}{dx} = \dfrac{a - x}{b_0 + b_1 x + b_2 x^2},$

where the origin is arbitrarily taken. Since we are concerned with truncated distributions, it is convenient to take the origin at the left terminus. In the derivations which follow we regard a, $b_0$, $b_1$, and $b_2$ as primary characterizing parameters of the distributions studied. The mean, standard deviation, $\alpha_3$, and $\alpha_4$ are expressed as functions of these quantities. To obtain a moment recursion formula for the general Pearson (complete) function, f(x), we separate the variables of (1), multiply both sides of the resulting equation by $x^k$, and integrate over the full range of permissible values of x, i.e., $r \le x \le s$. Thereby we obtain

(2)  $\int_r^s (b_0 + b_1 x + b_2 x^2)\, x^k\, df(x) = \int_r^s (a - x)\, x^k f(x)\, dx.$


Let the kth moment of the complete (population) distribution function, f(x), about the origin selected (i.e., about the left terminus) be designated as

(3)  $\mu'_k = \int_r^s x^k f(x)\, dx,$

and the right member of (2) becomes

$a\mu'_k - \mu'_{k+1}.$

Upon integrating the left member of (2) by parts, we obtain

$\big[(b_0 + b_1 x + b_2 x^2)\, x^k f(x)\big]_r^s - k b_0 \mu'_{k-1} - (k + 1) b_1 \mu'_k - (k + 2) b_2 \mu'_{k+1}.$

The Pearson system includes only those solutions of (1) for which f(r) = f(s) = 0, and moreover only those for which the first (bracketed) term of the above expression vanishes at both limits. As a consequence of these restrictions, we may combine the left and right members above, to obtain the following recursion formula for moments of the complete distribution about the origin:

(4)  $h\mu'_k + b_0 k \mu'_{k-1} + b_1 k \mu'_k + b_2 (k + 2) \mu'_{k+1} = \mu'_{k+1},$

where we have written

(5)  $h = a + b_1.$

If f(x) is normalized so that $\mu'_0 = 1$, and we let k = 0, 1, 2, and 3, successively, in (4), the resulting system of equations may be written as

(6)  $(2b_2 - 1)\mu'_1 = -h,$
     $(b_1 + h)\mu'_1 + (3b_2 - 1)\mu'_2 = -b_0,$
     $2b_0\mu'_1 + (2b_1 + h)\mu'_2 + (4b_2 - 1)\mu'_3 = 0,$
     $3b_0\mu'_2 + (3b_1 + h)\mu'_3 + (5b_2 - 1)\mu'_4 = 0.$

On solving (6) for moments of the complete distribution we obtain

(7)  $\mu'_1 = h/(1 - 2b_2),$
     $\mu'_2 = [b_0 + (b_1 + h)\mu'_1]/(1 - 3b_2),$
     $\mu'_3 = [2b_0\mu'_1 + (2b_1 + h)\mu'_2]/(1 - 4b_2),$
     $\mu'_4 = [3b_0\mu'_2 + (3b_1 + h)\mu'_3]/(1 - 5b_2).$

With h, $b_0$, $b_1$, and $b_2$ known, it is a simple matter to determine $\mu'_1$, $\mu'_2$, $\mu'_3$, and $\mu'_4$ from equations (7) in the order named. These equations might, of course, be rewritten with $\mu'_1$, $\mu'_2$, and $\mu'_3$ entirely eliminated from the right members. However, when this is done they become too complex in structure to be of practical value. After calculating $\mu'_1$, $\mu'_2$, $\mu'_3$, and $\mu'_4$ from (7), corresponding central moments can be determined from the well known translation formula

(8)  $\mu_k = \sum_{i=0}^{k} \dbinom{k}{i} \mu'_{k-i}\, (-\mu'_1)^i,$


and the standard moments from

(9)  $\alpha_k = \mu_k/\sigma^k,$

where $\sigma^2 = \mu_2$. The second central moment then becomes

(10)  $\mu_2 = \dfrac{1}{1 - 3b_2} \Big[b_0 + \dfrac{h}{1 - 2b_2}\Big(b_1 + \dfrac{h b_2}{1 - 2b_2}\Big)\Big].$

Similar formulas can also be written for $\mu_3$ and $\mu_4$ but they are too unwieldy to be useful. For each practical application, it seems preferable to compute noncentral moments about the left terminus from (7). Central and standard moments, as required, can then be obtained from (8) and (9).
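The computation just described (noncentral moments from (7), then central moments from (8)) can be sketched as follows. The function names are hypothetical and the code is only an illustration of the recursion, written so that each $\mu'_{k+1}$ is obtained from (4) with k = 1, 2, 3 after $\mu'_1 = h/(1 - 2b_2)$.

```python
from math import comb

def raw_moments(h, b0, b1, b2):
    """mu'_0, ..., mu'_4 about the left terminus, following equations (7)."""
    mu = [1.0, h / (1 - 2 * b2)]
    for k in (1, 2, 3):
        # recursion (4) solved for mu'_{k+1}
        mu.append((k * b0 * mu[k - 1] + (k * b1 + h) * mu[k]) / (1 - (k + 2) * b2))
    return mu

def central_moments(mu_raw):
    """Central moments mu_0, ..., mu_4 via the translation formula (8)."""
    m1 = mu_raw[1]
    return [sum(comb(k, i) * mu_raw[k - i] * (-m1) ** i for i in range(k + 1))
            for k in range(len(mu_raw))]
```

As a check, the Normal case $b_1 = b_2 = 0$ with h = 1, $b_0 = 4$ gives $\mu_2 = 4$, $\mu_3 = 0$, $\mu_4 = 48 = 3\mu_2^2$, and the Type III case $b_2 = 0$ with h = 2, $b_0 = 1$, $b_1 = 0.5$ gives $\mu_2 = b_0 + b_1 h = 2$.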

If we designate the left truncation point in standard units of the complete distribution by $\xi'$, we have $\xi' = (0 - \mu'_1)/\sigma$ and thus

(11)  $\mu'_1 = -\sigma\xi'.$

Although formulas expressing the mean, standard deviation, $\alpha_3$, and $\alpha_4$ explicitly as functions of a, $b_0$, $b_1$, and $b_2$ are unduly complex for the four-parameter distributions, as shown below they become relatively simple for Type III and Normal distributions.

Type III distribution. In this case $b_2 = 0$, and we have

(12)  $\mu'_1 = h, \qquad h = -\sigma\xi',$
      $\sigma = \sqrt{b_0 + b_1 h}, \qquad b_1 = \sigma\alpha_3/2,$
      $\alpha_3 = 2b_1/\sqrt{b_0 + b_1 h}, \qquad b_0 = \sigma^2[1 + \xi'\alpha_3/2].$

Normal distribution. In this case $b_1 = b_2 = 0$, and

(13)  $\mu'_1 = h, \qquad h = -\sigma\xi',$
      $\sigma = \sqrt{b_0}, \qquad b_0 = \sigma^2.$

3. Recursion formula for moments of incomplete distributions. If the limits 
of integration in equation (2) are reduced to include only the truncated range 
0 < xz < d, wherer < O andd < s, we have 


(14) . (bb + biz + by x’)2* df(z) = [ (a — x)zx‘f(x) dz. 


Define the kth moment of the truncated distribution about the left terminus as 


(15) m, = [ 21 dz, 


with mp = 1, and the right member of (14) becomes 


amy, — Meri. 
On integrating the left member of (14) by parts we obtain 
[(bo + bit + bex”)a*f(x)}o — kbomu — (k + 1)bym . 





TRUNCATED PEARSON DISTRIBUTIONS 259 


Since we are not integrating over the full range of z, the first term of the above 


expression does not vanish as it did with the complete distribution. However, 
if we define 


F, = f(O)bo > 
F = f(d)[bo + db, + d’bu], 


and then combine left and right members above, we obtain the following recur- 
sion formula for moments of the truncated distribution: 


(16) 


(17) hm, + bokmya + dikme, + bo(k + 2)migs — AF = megs (kK > 1). 


If we let $k = 0$ in (14) prior to integrating, and then proceed as outlined above,
we obtain


(18)    $h + 2 m_1 b_2 + F_1 - F = m_1$,


which may be regarded as a companion equation to (17) for the case $k = 0$.
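
As a numerical sanity check of the recursion, the following sketch evaluates both sides of (17) and (18) for a doubly truncated normal distribution. It assumes the normal case ($b_1 = b_2 = 0$, $b_0 = \sigma^2$), in which the integration by parts of (14) gives $h = a$; the function names, the Simpson integration, and the illustrative parameter values are not from the paper.

```python
import math

def truncated_normal_moments(a, sigma, d, kmax, n=20000):
    """Moments m_k of a normal density (mean a, s.d. sigma) renormalized on [0, d].

    Uses composite Simpson's rule with n (even) subintervals; returns (m, f),
    where m[k] is the k-th moment about the left terminus and f is the
    renormalized density.
    """
    def g(x):
        return math.exp(-0.5 * ((x - a) / sigma) ** 2)

    def simpson(fn):
        h = d / n
        s = fn(0.0) + fn(d)
        for i in range(1, n):
            s += (4 if i % 2 else 2) * fn(i * h)
        return s * h / 3

    z = simpson(g)                      # mass of the unnormalized density on [0, d]
    f = lambda x: g(x) / z
    m = [simpson(lambda x, k=k: x ** k * f(x)) for k in range(kmax + 1)]
    return m, f

def recursion_residuals(a, sigma, d, kmax=4):
    """Residuals of (18) and of (17) for k = 1..kmax in the normal case."""
    m, f = truncated_normal_moments(a, sigma, d, kmax + 1)
    b0 = sigma ** 2                     # assumption: normal case, b1 = b2 = 0
    h = a                               # assumption: h reduces to a when b1 = 0
    F1 = f(0.0) * b0                    # (16) at the left terminus
    F = f(d) * b0                       # (16) at the right terminus
    res = [h + F1 - F - m[1]]           # (18)
    for k in range(1, kmax + 1):        # (17)
        res.append(h * m[k] + b0 * k * m[k - 1] - d ** k * F - m[k + 1])
    return res

residuals = recursion_residuals(a=1.0, sigma=1.5, d=4.0)
```

All residuals should be zero up to the integration error, which gives a quick way to test any implementation of the moment recursion before it is used for estimation.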


4. Estimating $h$, $b_0$, $b_1$, and $b_2$ from doubly truncated samples. To obtain
estimates by equating observed sample moments to population moments, we
substitute the observed sample moments $\nu_k$ for the $m_k$ in (17), simultaneously
replacing the population parameters $h$, $b_0$, $\cdots$, and $F$ by their estimates $h^*$,
$b_0^*$, $\cdots$, $F^*$. Setting $k = 1, 2, \cdots, 5$, successively, we find the estimating
equations


        $\nu_1 h^* + b_0^* + \nu_1 b_1^* + 3\nu_2 b_2^* - d F^* = \nu_2$,
        $\nu_2 h^* + 2\nu_1 b_0^* + 2\nu_2 b_1^* + 4\nu_3 b_2^* - d^2 F^* = \nu_3$,
(19)    $\nu_3 h^* + 3\nu_2 b_0^* + 3\nu_3 b_1^* + 5\nu_4 b_2^* - d^3 F^* = \nu_4$,
        $\nu_4 h^* + 4\nu_3 b_0^* + 4\nu_4 b_1^* + 6\nu_5 b_2^* - d^4 F^* = \nu_5$,
        $\nu_5 h^* + 5\nu_4 b_0^* + 5\nu_5 b_1^* + 7\nu_6 b_2^* - d^5 F^* = \nu_6$.


These constitute a linear system of five equations in the five estimates, $h^*$, $b_0^*$,
$b_1^*$, $b_2^*$, and $F^*$, which can be solved by any of the standard methods applicable
to such systems. For practical applications, the writer suggests using either
the method of "single division" or "multiplication and subtraction" as described
by Dwyer [12]. With estimates $h^*$, $b_0^*$, $b_1^*$, and $b_2^*$ thus calculated, we substitute
these values in (7) to estimate moments of the complete distribution, and
subsequently compute the required estimates of population (complete distribution)
parameters with the aid of (8) and (9). $F_1^*$ can be computed from (18) upon
replacing parameters by their estimates and $m_1$ by $\nu_1$. It will be noted that these
estimates are consistent since if they should be calculated from the entire
population they would obviously equal the required parameters.
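
The estimating equations can be set up and solved mechanically. The sketch below builds the five equations obtained from (17) for $k = 1, \cdots, 5$ and solves them by Gaussian elimination with partial pivoting, which stands in here for Dwyer's hand methods. The function names are hypothetical. The moments used are the Sheppard-corrected values of Section 8, expressed in units of 5 lbs. (so that $d = 80/5 = 16$); that rescaling is an assumption made for numerical convenience.

```python
def estimating_system(nu, d):
    """Rows and right-hand sides of the five estimating equations.

    nu[k] is the k-th sample moment about the left terminus (nu[0] = 1, k <= 6);
    the unknowns are ordered (h*, b0*, b1*, b2*, F*).
    """
    rows, rhs = [], []
    for k in range(1, 6):
        rows.append([nu[k], k * nu[k - 1], k * nu[k], (k + 2) * nu[k + 1], -d ** k])
        rhs.append(nu[k + 1])
    return rows, rhs

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))   # pivot row
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            t = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= t * m[c][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                          # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# Corrected moments of Section 8 in units of 5 lbs. (assumed scaling), d = 16.
nu = [1.0, 7.56676860, 66.0693171, 642.250764, 6781.37901, 76331.5834, 902910.393]
rows, rhs = estimating_system(nu, 16.0)
h, b0, b1, b2, F = solve_linear(rows, rhs)
# To return to pounds under this scaling: h and b1 and F are multiplied by 5,
# b0 by 25, and b2 is unchanged.
```

The system involves moments up to the sixth order and can be poorly conditioned, so the solution should be compared with the hand-computed values of Section 8 rather than taken on trust.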

Although neither $F_1^*$ nor $F^*$ is required in estimating moments of the complete
distribution, a comparison of their values found on solving (18) and (19)
with corresponding values computed from the finally fitted curve with the
aid of (16) affords a check on agreement between the fitted curve and observed
sample data.

It should be noted here that estimates are distinguished from parameters 
throughout this paper by starring (*) the estimates. 


5. Determining type of distribution. With estimates of $\mu_1'$, $\sigma$, $\alpha_3$, and $\alpha_4$
computed as indicated in Section 4, the type of the distribution involved can
be established from the original Pearson criteria, an excellent exposition of
which has been given by Elderton [13], or from the Carver-Craig criteria [14].
In the present instance, however, since estimates of $b_0$, $b_1$, and $b_2$ must of
necessity be computed before estimates of the population parameters can be
obtained, it seems more appropriate to determine the type directly from the
quadratic equation

(20)    $b_0 + b_1 x + b_2 x^2 = 0$.

The general solution of the differential equation (1) can be written as 
(21) f(x) = C(x — n)"(r2 — x)”, 


where $r_1$ and $r_2$ are roots of (20) (cf., for example, [14]). The nature of these roots
determines the type of the distribution. If we let $D$ designate the discriminant,
$D = b_1^2 - 4 b_0 b_2$, the principal Pearson curves¹ may be classified as follows:

Type I      $r_1 - \mu_1' < 0 < r_2 - \mu_1'$,                                 $D > 0$;
     II     $(r_1 - \mu_1') = -(r_2 - \mu_1')$, $b_1 = 2 b_2 \mu_1'$,           $D > 0$;
     III    $b_2 = 0$;
     IV     $r_1$ and $r_2$ imaginary, $b_1 \neq 2 b_2 \mu_1'$,                 $D < 0$;
     VI     $(r_1 - \mu_1')$, $(r_2 - \mu_1')$ of the same sign,                $D > 0$;
     VII    $r_1$ and $r_2$ imaginary, $b_1 = 2 b_2 \mu_1'$,                    $D < 0$;
     Normal $b_1 = b_2 = 0$.


It can be shown that a necessary condition for the odd central moments to
equal zero (i.e., for $f(x)$ to be symmetrical about its mean) is that


        $b_1 = 2 b_2 \mu_1'$.
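
The first step of this classification, computing the discriminant and the roots of (20), can be sketched as follows. The helper name is hypothetical and the sketch stops short of assigning a Pearson type, since that also requires the position of the roots relative to the mean.

```python
import cmath

def quadratic_summary(b0, b1, b2):
    """Discriminant D = b1^2 - 4 b0 b2 and the roots of b0 + b1 x + b2 x^2 = 0.

    Returns (D, roots). With b2 = 0 the quadratic degenerates (the Type III and
    normal cases) and at most one root is returned. Roots are returned as
    complex numbers so that the imaginary case D < 0 is handled uniformly.
    """
    D = b1 * b1 - 4.0 * b0 * b2
    if b2 == 0:
        roots = () if b1 == 0 else (complex(-b0 / b1),)
    else:
        s = cmath.sqrt(D)
        roots = ((-b1 - s) / (2.0 * b2), (-b1 + s) / (2.0 * b2))
    return D, roots

D, roots = quadratic_summary(2.0, -3.0, 1.0)   # illustrative coefficients only
```

With estimates substituted for $b_0$, $b_1$, $b_2$, a value of $b_2^*$ close to zero makes the roots extremely sensitive to sampling error, which is one reason to test for the Type III case first.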


6. Singly truncated samples. If only the left tail is omitted, then $F$ vanishes,
and we can drop from (19) the last equation, which would otherwise be required,
after placing $F^* = 0$ in the remaining equations. If only the right tail is missing,
then $F_1 = 0$, and by changing the variable from $x$ to $d - x$ we can translate the
origin to the truncation point on the right, set $F_1^* = 0$, and again drop the last
equation otherwise required in (19). As an alternate and frequently preferable


1 The numbering of the types followed here is that of Craig [14]. 







procedure when some origin other than the truncation point of a singly truncated
sample has been employed, we might substitute (18) for the last equation of
(19) after replacing parameters by their estimates. In both instances, the order
of the highest order sample moment required is reduced by one from the
requirements for doubly truncated samples.

In practical applications, finding either $F_1^*$ or $F^*$ equal or almost equal to
zero from a sample that is represented as being doubly truncated suggests that
perhaps the sample was in fact only singly truncated. In this case, either the
sample terminus is the terminus of the complete distribution or the absence of
lower sample values is due to the small probability associated with their
occurrence. Finding both $F_1^*$ and $F^*$ equal or nearly equal to zero suggests that
the sample was not truncated after all, and that the necessary estimates should
be computed from estimating equations applicable to complete samples.

When the sample terminus is employed as an estimate of the corresponding
population terminus, an additional equation may be dropped from (19) since
in this case we are estimating one less parameter from the moment equations.
To illustrate, consider a Type III distribution for which the left sample terminus
(origin) is considered as an appropriate estimate of the population lower limit.
We then have

        $h = 2\sigma/\alpha_3$,


and from (12)

        $h = (b_0 + b_1 h)/b_1$.


Consequently it follows that $b_0 = 0$, and the system of estimating equations to
be solved consists of the first two equations of (19) plus (18) with $b_0^* = b_2^* = 0$.
The parameters appearing in (18) are of course replaced by their estimates.


7. Type III and normal distributions. When it is desired to estimate parameters
of a Type III distribution (for which $b_2 = 0$) from a doubly truncated sample,
we need calculate only the first five sample moments and solve the first
four equations of (19) after placing $b_2^* = 0$. With singly truncated samples from
which the left tail is missing, we require only the first four sample moments and
need solve only the first three equations of (19) after setting $F^* = 0$.

To estimate parameters of a normal distribution (for which $b_1 = b_2 = 0$)
from doubly truncated samples, we calculate the first four sample moments and
solve the first three equations of (19) after setting $b_1^* = b_2^* = 0$. With singly
truncated samples from which the left tail is missing, we require only the first
three sample moments and need solve only the first two equations of (19) after
setting $F^* = 0$.


8. A numerical example. To illustrate the application of results obtained in 
this paper to practical problems, we consider an example given by Miss Shook 
[15] on the weights of 1000 female students (cf. Table 1). Miss Shook considered 
her data as a complete (untruncated) sample from a Pearson Type III popula- 





TABLE 1
Weights of 1000 female students

                                       Graduated frequencies based on Type III distribution
Weight in pounds          Observed     Complete     Truncated     Limit at sample     Doubly
                          frequency    sample       on right      terminus            truncated

 70- 79.9
 80- 89.9
 90- 99.9
100-109.9
110-119.9
120-129.9                 196
130-139.9                 122
140-149.9                 63
150-159.9                 23
160-169.9                 2
170-179.9                 7
180-189.9                 1
190-199.9
200-209.9                 1
210-219.9                 1

Total                     1000                      995.3         1002.               995.0

Total frequency in
truncated range                                     980.9         981.                980.6

M* (lbs.)                              8.74         118.55        119.14              118.56
σ* (lbs.)                              9175         16.027        16.958              16.024
α₃*                                    976424       0.655         0.865               0.657
Lower limit (lbs.)                                  69.61         79.95               69.77

F₁* (from sample moments)                                                             -0.002
    (from fitted curve)                             0.006                             0.005

F*  (from sample moments)                           0.767         1.358               0.769
    (from fitted curve)                             0.732         1.183               0.733

Truncated sample obtained by truncating the complete sample on the left
at 79.95 lbs. and on the right at 159.95 lbs.




tion, and employed the method of moments to estimate population parameters. 
Using these estimates, she then graduated the observed sample data. 

For our purposes, we truncate Miss Shook's sample on the left at 79.95 lbs.
and on the right at 159.95 lbs., thus eliminating the first and the last six cells
of the grouped data. The retained (truncated) sample then consists of 981
observations, all of which are within the range 79.95 to 159.95 lbs. We disregard
all prior knowledge about the type of the population, and accordingly compute
the first six sample moments about the lower terminus. In order to compensate
for moment errors due to grouping, we apply Sheppard's² corrections for
noncentral moments. Both sets of moments are given below.


Uncorrected moments                    Corrected moments

$n_1 = (7.56676860)\,5$                $\nu_1 = (7.56676860)\,5$
$n_2 = (66.4026504)\,5^2$              $\nu_2 = (66.0693171)\,5^2$
$n_3 = (649.817533)\,5^3$              $\nu_3 = (642.250764)\,5^3$
$n_4 = (6913.71764)\,5^4$              $\nu_4 = (6781.37901)\,5^4$
$n_5 = (78479.9827)\,5^5$              $\nu_5 = (76331.5834)\,5^5$
$n_6 = (937015.638)\,5^6$              $\nu_6 = (902910.393)\,5^6$
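
The corrections can be reproduced with the standard Sheppard formulas for moments about an arbitrary origin. The sketch below is written in units of 5 lbs., in which the 10-lb. class interval of Table 1 has width $w = 2$; the function name and this choice of units are assumptions made for illustration.

```python
def sheppard_raw(n, w):
    """Sheppard-corrected raw moments nu[0..6] from grouped raw moments n[0..6].

    n[k] is the uncorrected k-th moment about a fixed origin (n[0] = 1) and w is
    the class width, in the same units as the moments.
    """
    w2, w4, w6 = w ** 2, w ** 4, w ** 6
    return [
        1.0,
        n[1],
        n[2] - w2 / 12,
        n[3] - w2 * n[1] / 4,
        n[4] - w2 * n[2] / 2 + 7 * w4 / 240,
        n[5] - 5 * w2 * n[3] / 6 + 7 * w4 * n[1] / 48,
        n[6] - 5 * w2 * n[4] / 4 + 7 * w4 * n[2] / 16 - 31 * w6 / 1344,
    ]

# Uncorrected moments from the text, in units of 5 lbs. (class width w = 2).
n = [1.0, 7.56676860, 66.4026504, 649.817533, 6913.71764, 78479.9827, 937015.638]
nu = sheppard_raw(n, 2.0)
```

The second, third, fourth, and sixth corrected moments obtained this way agree with the tabulated values to the digits shown, which supports reading the tabulated figures as coefficients of powers of 5.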


We substitute these values in (19) and solve the system by Dwyer's method
of multiplication and subtraction to obtain $h^* = 44.973178$, $b_0^* = -53.5929$,
$b_1^* = 12.339508$, $b_2^* = -0.084107$, and $F^* = 0.578321$. From (18) we then
obtain $F_1^* = -0.196817$. The small negative value thus computed suggests
that perhaps $F_1$ actually has the value zero and that there was no truncation on
the left.

Considering the sample as being truncated on the right only, and rather than
translate our origin to the right sample terminus, we substitute (18) with $F_1^* = 0$
for the last equation of (19) to obtain a new system of five equations in the same
five unknown estimates as before, but involving only the first five sample
moments. On solving the new set of equations, we obtain $h^* = 38.530928$,
$b_0^* = 54.83444$, $b_1^* = 5.179891$, $b_2^* = 0.000986$, and $F^* = 0.771707$.

The small values obtained for $b_2^*$ in both the above cases lead us to conjecture
that $b_2$ actually has the value zero, and that our sample came from a Type III
population.

With the sample considered as coming from a Type III population and as
being truncated on the right only, we solve the system consisting of the first
three equations of (19) plus (18) with $b_2^* = F_1^* = 0$, and obtain $h^* = 38.600670$,
$b_0^* = 54.1194$, $b_1^* = 5.247727$, and $F^* = 0.766827$. On substituting these values
in (12) we have $\mu_1'^* = 38.60$ lbs., $\sigma^* = 16.027$ lbs., and $\alpha_3^* = 0.655$. The mean
referred to zero as an origin is estimated as $M^* = \mu_1'^* + 79.95$ lbs. $= 118.55$ lbs.
The corresponding estimate of the lower limit is 69.61 lbs. A graduation of the
sample data using these estimates and carried out with the aid of Salvosa's


2 See for example reference [16], formula 27.9.3, page 361. 







tables [17] is given in Table 1, along with Miss Shook’s original graduation 
which was based on estimates from the complete sample. 

To provide additional comparisons, we compute further estimates with the
sample assumed to be doubly truncated from a Type III population. Accordingly,
we solve the first four equations of (19) with $b_2^* = 0$ to obtain $h^* =
38.605540$, $b_0^* = 53.5835$, $b_1^* = 5.262710$, and $F^* = 0.769439$. From (18) we find
$F_1^* = -0.002258$. Similarly, we calculate an additional set of estimates under
the assumption that the sample was truncated on the right only but with the
left sample terminus being the lower limit of the complete distribution. In this
case, the system of three equations consisting of the first two equations of (19)
plus (18) with $b_0^* = b_2^* = 0$ yields the solutions $h^* = 39.191957$, $b_1^* = 7.337336$,
and $F^* = 1.358114$. Estimates of the basic population parameters for each of
the above cases, along with graduations over the complete sample range, are
also included in Table 1.

The agreement between observed and graduated frequencies is found to be 
much better for estimates based on the truncated sample than for estimates 
based on the complete sample. The improved results obtained with the truncated 
sample suggest that perhaps some of the extreme observations in Miss Shook’s 
original data came from a different population than that which accounted for 
the main body of her data. It makes little difference whether the truncated 
sample is considered as being singly or doubly truncated or whether the left 
sample terminus is used as an estimate of the population lower limit or not. It
will also be noted that the values of $F_1^*$ and $F^*$ as computed from the finally
fitted curves with the aid of (16) are in substantially close agreement with the
corresponding values found on solving the moment equations. In each case the
graduations are very nearly equal throughout the entire sample range, and any
one of the truncated sample graduations would be considered as a satisfactory
fit to the observed data over the truncated range. Certainly any one of the
three sets of estimates would, for this example, be adequate as first approxima- 
tions on which to base iterations to maximum likelihood estimates in a manner 
similar to that previously employed by Koshal [18] in improving moment esti- 
mates from complete samples. The writer hopes to give further consideration 
in a subsequent paper to the problems of such iterations when samples are 
truncated. 

REFERENCES 

[1] Karl Pearson and Alice Lee, "On the generalized probable error in multiple normal
correlation," Biometrika, Vol. 6 (1908), pp. 59-68.

[2] Alice Lee, "Table of Gaussian 'tail' functions when the 'tail' is larger than
the body," Biometrika, Vol. 10 (1915), pp. 208-215.

[3] R. A. Fisher, "Properties and applications of Hh functions," Mathematical Tables,
Vol. 1, pp. xxvi-xxxv, British Association for the Advancement of Science, 1931.

[4] W. L. Stevens, "The truncated normal distribution," Appendix to "The calculation
of the time-mortality curve" by C. I. Bliss, Annals of Applied Biology,
Vol. 24 (1937), pp. 815-852.

[5] W. G. Cochran, "Use of IBM equipment in an investigation of the truncated normal
problem," Proc. Research Forum, International Business Machines Corp., 1946,
pp. 40-43.

[6] Johannes Ipsen, Jr., "A practical method of estimating the mean and standard
deviation of truncated normal distributions," Human Biology, Vol. 21 (1949), pp.
1-16.

[7] A. Hald, "Maximum likelihood estimation of the parameters of a normal distribution
which is truncated at a known point," Skandinavisk Aktuarietidskrift, Haft 3-4
(1949), pp. 119-134.

[8] A. C. Cohen, Jr., "On estimating the mean and standard deviation of truncated
normal distributions," Jour. Am. Stat. Assn., Vol. 44 (1949), pp. 518-525.

[9] A. C. Cohen, Jr., "Estimating the mean and variance of normal populations from
singly truncated and doubly truncated samples," Annals of Math. Stat., Vol. 21
(1950), pp. 557-569.

[10] D. J. Finney, "The truncated binomial distribution," Annals of Eugenics, Vol. 14
(1949), pp. 319-328.

[11] A. C. Cohen, Jr., "Estimating parameters of Pearson Type III populations from
truncated samples," Jour. Am. Stat. Assn., Vol. 45 (1950), pp. 411-423.

[12] P. S. Dwyer, "The solution of simultaneous equations," Psychometrika, Vol. 6 (1941),
pp. 101-129.

[13] W. P. Elderton, Frequency Curves and Correlation, 3rd ed., Cambridge University
Press, 1938.

[14] C. C. Craig, "A new exposition and chart for the Pearson system of frequency
curves," Annals of Math. Stat., Vol. 7 (1936), pp. 16-28.

[15] B. L. Shook, "Synopsis of elementary mathematical statistics," Annals of Math.
Stat., Vol. 1 (1930), pp. 14-41.

[16] H. Cramér, Mathematical Methods of Statistics, Princeton University Press, 1946.

[17] Luis R. Salvosa, "Tables of Pearson's Type III function," Annals of Math. Stat.,
Vol. 1 (1930), appended.

[18] R. S. Koshal, "Application of the method of maximum likelihood to the improvement
of curves fitted by the method of moments," Jour. Roy. Stat. Soc., Vol. 6
(1933), pp. 303-13.







ON THE DISTRIBUTION OF THE CHARACTERISTIC ROOTS 
OF NORMAL SECOND-MOMENT MATRICES¹


By A. M. Mood
The Rand Corporation 


1. Summary. Distributions of characteristic roots have been obtained by 
Girshick [1], Fisher [2], Hsu [3], and Roy [4]. The present paper outlines an
alternative derivation of these distributions which is somewhat more elementary 
than those that have been published and which may have some pedagogical 
utility. The primary object of the paper, however, is to obtain the normalizing 
constants for these distributions; though the correct values of the constants 
have been published in the references cited above, no convincing derivation 
seems to have been recorded. 


2. The problem. Let

(1)    $a_{ij} = \sum_{\alpha=1}^{m} (x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j)$    $(i, j = 1, 2, \cdots, k)$

be sums of squares and products for samples of size $m\,(>k)$ from a $k$-variate
normal distribution with covariance matrix $\|\sigma_{ij}\|$ $(= \|\sigma^{ij}\|^{-1})$ having a $k$-fold
characteristic root $\lambda$. The $a_{ij}$ are distributed by the Wishart density function


(2)    $f(a_{ij}, m - 1, \sigma^{ij}) = \dfrac{|\sigma^{ij}|^{\frac12(m-1)}\, |a_{ij}|^{\frac12(m-k-2)}\, e^{-\frac12 \sum \sigma^{ij} a_{ij}}}{2^{\frac12 k(m-1)}\, \pi^{\frac14 k(k-1)} \prod_{i=1}^{k} \Gamma[\frac12(m - i)]}$


with $m - 1$ degrees of freedom. Let $b_{ij}$ be similarly distributed with $n - 1$
degrees of freedom and independently of the $a_{ij}$.
We are concerned with the distribution of the roots $w_1, \cdots, w_k$ of


(3)    $|a_{ij} - w\sigma_{ij}| = 0$,


which roots form a natural multivariate analogue of chi-square. Similarly the
roots of


(4)    $|a_{ij} - v b_{ij}| = 0$

provide an analogue for the variance ratio, and the roots of


(5)    $|a_{ij} - u(a_{ij} + b_{ij})| = 0$


an analogue for the intraclass correlation. More important, the roots of (5)


1 This work was done during the academic year 1939-40 when the author was a graduate 
student at Princeton University; it was completed just as the Hsu and Fisher papers ap- 
peared, and was therefore never submitted for publication. Recently the author learned 
from Hotelling that a derivation of the normalizing constants would be of interest. 




are directly related to Hotelling's canonical correlations [5] for two sets of
variates. For all these problems it is necessary only to obtain the distribution
for the roots of (5) since the roots of (4) are


(6)    $v_i = u_i/(1 - u_i)$,


and the distribution of the $w_i$ may be obtained by letting $n \to \infty$ in the
distribution of the $u_i$.


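3. Density function for the $u_i$. It is no essential simplification to suppose, as
we shall do, that


(7)    $\sigma^{ij} = \delta_{ij}$    ($= 1$ for $i = j$, $= 0$ for $i \neq j$).


The joint density for $a_{ij}$ and $b_{ij}$ is


(8)    $f(a_{ij}, m - 1, \sigma^{ij})\, f(b_{ij}, n - 1, \sigma^{ij})$,


where $f$ is defined by (2). If $u_i$ $(u_1 \le u_2 \le \cdots \le u_k)$ are the roots of (5), there
exists [6] a nonsingular linear transformation $\|q^{ij}\|$ such that

(9)     $\|q^{ij}\|'\, \|a_{ij} + b_{ij}\|\, \|q^{ij}\| = \|\delta_{ij}\|$,
(10)    $\|q^{ij}\|'\, \|a_{ij}\|\, \|q^{ij}\| = \|u_i \delta_{ij}\|$,
(11)    $\|q^{ij}\|'\, \|b_{ij}\|\, \|q^{ij}\| = \|(1 - u_i)\delta_{ij}\|$,

where the prime denotes the transpose.

We shall transform the $k^2 + k$ variates $a_{ij}$ and $b_{ij}$ of (8) to the $k^2 + k$ variates
$q_{ij}$ and $u_i$ where

        $\|q_{ij}\| = (\|q^{ij}\|')^{-1}$.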


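The transformed density is


(12)    $K_1\, |q_{ij}|^{m+n-2k-4} \left[\prod_{1}^{k} u_i\right]^{\frac12(m-k-2)} \left[\prod_{1}^{k} (1 - u_i)\right]^{\frac12(n-k-2)} e^{-\frac12 \sum_{i,j} q_{ij}^2}\, J$,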

where $J$ is the Jacobian $\partial(a_{ij}, b_{ij})/\partial(u_i, q_{ij})$ and $K_1$ is the normalizing constant.
We next show that $J$ factors into a function of $q_{ij}$ only and a function of $u_i$
only. Let the earlier variables be ordered


        $a_{11}, a_{12}, \cdots, a_{1k}, a_{22}, a_{23}, \cdots, a_{2k}, a_{33}, \cdots, a_{3k}, \cdots, a_{kk}$,

        $b_{11}, \cdots, b_{1k}, b_{22}, \cdots, b_{2k}, \cdots, b_{kk}$,


and the new variables will be ordered


        $u_1, u_2, \cdots, u_k, q_{11}, q_{21}, \cdots, q_{k1}, q_{12}, q_{22}, \cdots$.





On differentiating the relations


(13)    $a_{ij} = \sum_{r} q_{ir} q_{jr} u_r$,

(14)    $b_{ij} = \sum_{r} q_{ir} q_{jr} (1 - u_r)$,

the Jacobian can be written down directly. Supposing this to have been done
(with the $a_{ij}$ and $b_{ij}$ corresponding to columns and the $u_i$ and $q_{ij}$ to rows), the
result can be simplified by adding the first column of the left half to the first
column of the right half, the second column of the left half to the second column
of the right half, etc. The first row of the resulting determinant then has elements


        $q_{11}^2,\ q_{11}q_{21},\ q_{11}q_{31},\ \cdots,\ q_{11}q_{k1},\ q_{21}^2,\ q_{21}q_{31},\ \cdots,\ q_{21}q_{k1},\ \cdots,\ q_{k1}^2$


in the left half, and zeros in the right half. The $(k + 1)$th row, for example,
has elements


        $2q_{11}u_1,\ q_{21}u_1,\ \cdots,\ q_{k1}u_1,\ 0,\ 0,\ \cdots,\ 0$


in the left half, and the same set with the $u$'s omitted in the right half.

Now we show that $J$ vanishes if $u_1 = u_2$. It will be easy to follow the argument
if one writes down the complete Jacobian for $k = 3$. Assuming $u_1 = u_2$,
the following steps produce a row of zeros in $J$:

1) Multiply the columns of the right half by $u_1$ and subtract from the
corresponding columns of the left half. This makes the elements
of the left half of rows $k + 1$ through $3k$ all zero.

2) Make all elements of the $(k + 1)$th row zero except the element
$2q_{11}$ (in the $b_{11}$ column) by subtracting proper multiples of the $b_{11}$
column from the columns having nonzero elements in that row.

3) Make all elements of the $(k + 2)$th row zero except that in the
$b_{12}$ column.

4) Make all elements of the $(k + 3)$th row zero except that in the
$b_{13}$ column.

$\cdots$

$k + 1$) Make all elements of the $(2k)$th row zero except that in the $b_{1k}$
column.

$k + 2$) Make all elements of the $(2k + 1)$th row zero by subtracting proper
multiples of the $k$ rows above it from that row.

It follows therefore that $J$ has the factor $(u_1 - u_2)$.
Similarly $J$ must have all factors of the form $u_i - u_j$; hence $J$ has the factor

(15)    $\prod_{i>j} (u_i - u_j)$,

and since $J$ is of total degree $k(k - 1)/2$ in the $u$'s the other factor of $J$ must
involve only the $q$'s.





Thus it follows that (12) factors into a function of the $q_{ij}$ only and a function
of the $u_i$ only, say


(16)    $D(u) = K_2 \left[\prod_{1}^{k} u_i\right]^{\frac12(m-k-2)} \left[\prod_{1}^{k} (1 - u_i)\right]^{\frac12(n-k-2)} \prod_{i>j} (u_i - u_j)$.


4. Normalizing constant. Let us define


(17)    $L(\alpha, \beta) = \int \cdots \int_R \left[\prod_{1}^{k} u_i\right]^{\alpha} \left[\prod_{1}^{k} (1 - u_i)\right]^{\beta} \prod_{i>j} (u_i - u_j) \prod_{1}^{k} du_i$;


then the normalizing constant of (16) is

(18)    $K_2 = 1/L[\tfrac12(m - k - 2), \tfrac12(n - k - 2)]$.


Our procedure will be to first express $L(\alpha, \beta)$ as a multiple of $L(0, 0)$ and then
to evaluate the latter factor directly.
In view of (9), (10), and (11),

(19)    $\prod u_i = |a_{ij}| / |a_{ij} + b_{ij}|$,
(20)    $\prod (1 - u_i) = |b_{ij}| / |a_{ij} + b_{ij}|$;

hence

(21)    $E\left( \dfrac{|a_{ij}|^r\, |b_{ij}|^s}{|a_{ij} + b_{ij}|^{r+s}} \right) = \dfrac{L[\tfrac12(m - k - 2) + r,\ \tfrac12(n - k - 2) + s]}{L[\tfrac12(m - k - 2),\ \tfrac12(n - k - 2)]}$.

But this quantity is determinable from (8) by a method due to Wilks [7]. Since
the elements of $\|a_{ij} + b_{ij}\|$ are distributed by

(22)    $f(a_{ij} + b_{ij}, m + n - 2, \sigma^{ij})$,

we find first that

(23)    $E(|a_{ij} + b_{ij}|^c) = |\tfrac12\sigma^{ij}|^{-c} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(m + n - 1 - i) + c]}{\Gamma[\tfrac12(m + n - 1 - i)]}$,

as does Wilks in [7]. Thus


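(24)    $\int \cdots \int |a_{ij} + b_{ij}|^c\, f(a_{ij}, m - 1, \sigma^{ij})\, f(b_{ij}, n - 1, \sigma^{ij}) \prod da_{ij} \prod db_{ij}$
        $= |\tfrac12\sigma^{ij}|^{-c} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(m + n - 1 - i) + c]}{\Gamma[\tfrac12(m + n - 1 - i)]}$,

or in another form

(25)    $\int \cdots \int |a_{ij} + b_{ij}|^c\, |a_{ij}|^{\frac12(m-k-2)}\, |b_{ij}|^{\frac12(n-k-2)}\, e^{-\frac12 \sum \sigma^{ij}(a_{ij} + b_{ij})} \prod da_{ij} \prod db_{ij}$
        $= \dfrac{\pi^{\frac12 k(k-1)}}{|\tfrac12\sigma^{ij}|^{\frac12(m+n-2)+c}} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(m - i)]\, \Gamma[\tfrac12(n - i)]\, \Gamma[\tfrac12(m + n - 1 - i) + c]}{\Gamma[\tfrac12(m + n - 1 - i)]}$.

In this expression we replace $m$ by $m + 2r$ and $n$ by $n + 2s$ and multiply the
whole by

        $\dfrac{|\tfrac12\sigma^{ij}|^{\frac12(m+n-2)}}{\pi^{\frac12 k(k-1)} \prod_{i=1}^{k} \Gamma[\tfrac12(m - i)]\, \Gamma[\tfrac12(n - i)]}$

to get

(26)    $E(|a_{ij} + b_{ij}|^c\, |a_{ij}|^r\, |b_{ij}|^s)$
        $= |\tfrac12\sigma^{ij}|^{-(r+s+c)} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(m - i) + r]\, \Gamma[\tfrac12(n - i) + s]\, \Gamma[\tfrac12(m + n - 1 - i) + r + s + c]}{\Gamma[\tfrac12(m - i)]\, \Gamma[\tfrac12(n - i)]\, \Gamma[\tfrac12(m + n - 1 - i) + r + s]}$.

In this we put $c = -(r + s)$ to get an expression for the right side of (21). In
the resulting relation we put $m = n = k + 2$ to get

(27)    $L(r, s) = L(0, 0) \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(k + 2 - i) + r]\, \Gamma[\tfrac12(k + 2 - i) + s]\, \Gamma[\tfrac12(2k + 3 - i)]}{\Gamma[\tfrac12(2k + 3 - i) + r + s]\, \Gamma[\tfrac12(k + 2 - i)]\, \Gamma[\tfrac12(k + 2 - i)]}$.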


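Now we are left only with the problem of evaluating


(28)    $L(0, 0) = \int \cdots \int_R \prod_{i>j} (u_i - u_j) \prod_{1}^{k} du_i$,


where $R$ is the region $0 \le u_1 \le u_2 \le \cdots \le u_k \le 1$. We first observe that the
integrand may be put in the determinantal form


(29)    $\prod_{i>j} (u_i - u_j) = |u_i^{\,j-1}|$

(30)    $= \sum_{per} (-1)^{t(per)} \prod_{i=1}^{k} u_i^{\alpha_i - 1}$,


where $\alpha_1, \alpha_2, \cdots, \alpha_k$ is a permutation of $1, 2, \cdots, k$; where the sum is over all
permutations of these integers; and where $t(per)$ is the number of transpositions
in the permutation.

On integrating (30) over $R$ it is found that


(31)    $L(0, 0) = \sum_{per} \dfrac{(-1)^{t(per)}}{\alpha_1(\alpha_1 + \alpha_2)(\alpha_1 + \alpha_2 + \alpha_3) \cdots (\alpha_1 + \alpha_2 + \cdots + \alpha_k)}$.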



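It is shown in the Appendix that this sum has the value

(32)    $L(0, 0) = \dfrac{1}{k!} \prod_{i>j} \dfrac{i - j}{i + j}$

(33)    $= \dfrac{(k - 1)!\,(k - 2)! \cdots 2!\,1!}{k!\,[3 \cdot 4 \cdot 5 \cdots (k + 1)][5 \cdot 6 \cdots (k + 2)][7 \cdot 8 \cdots (k + 3)] \cdots [2k - 1]}$

(34)    $= \prod_{i=1}^{k} \dfrac{\Gamma(k - i + 1)\, \Gamma(2k + 1 - 2i)}{\Gamma(2k + 1 - i)}$.

This may also be put in the form

(35)    $L(0, 0) = \dfrac{1}{\pi^{k/2}} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(k + 2 - i)]\, \Gamma[\tfrac12(k + 2 - i)]\, \Gamma[\tfrac12(k + 1 - i)]}{\Gamma[\tfrac12(2k + 3 - i)]}$.

The identity of (34) and (35) is easily shown by induction on $k$ employing the
relation

(36)    $\Gamma(h + \tfrac12)\, \Gamma(h + 1) = \sqrt{\pi}\, \Gamma(2h + 1)/2^{2h}$.

The form (35) simplifies the final expression for $K_2$ which is found by putting
(27) and (35) in (18) to get


(37)    $K_2 = \pi^{k/2} \prod_{i=1}^{k} \dfrac{\Gamma[\tfrac12(m + n - 1 - i)]}{\Gamma[\tfrac12(m - i)]\, \Gamma[\tfrac12(n - i)]\, \Gamma[\tfrac12(k + 1 - i)]}$.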


Putting (37) in (16) we have the density function for the roots of (5), and 
the densities for the roots of (3) and (4) can then be obtained as stated at the 
end of Section 2. 
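
A short numerical check of the normalizing constant is sketched below. It assumes the constant has the standard form $K_2 = \pi^{k/2} \prod_{i=1}^{k} \Gamma[\tfrac12(m+n-1-i)] / \{\Gamma[\tfrac12(m-i)]\,\Gamma[\tfrac12(n-i)]\,\Gamma[\tfrac12(k+1-i)]\}$ and works on the log scale to avoid overflow; the function name is hypothetical. Two cases are easy to confirm by hand: for $k = 1$ the density (16) is a beta density, and for $k = 2$, $m = n = 4$ both exponents in (16) vanish, so $K_2 = 1/L(0, 0) = 6$.

```python
import math

def log_k2(m, n, k):
    """Logarithm of the normalizing constant K_2 (assumed form, see lead-in).

    Requires m > k and n > k so that every gamma argument is positive.
    """
    s = 0.5 * k * math.log(math.pi)
    for i in range(1, k + 1):
        s += (math.lgamma(0.5 * (m + n - 1 - i))
              - math.lgamma(0.5 * (m - i))
              - math.lgamma(0.5 * (n - i))
              - math.lgamma(0.5 * (k + 1 - i)))
    return s

k2_two_roots = math.exp(log_k2(4, 4, 2))   # exponents in (16) are zero here
k2_one_root = math.exp(log_k2(5, 7, 1))    # reciprocal of the beta function B(2, 3)
```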


APPENDIX 


We wish to demonstrate that if $g_i$ $(i = 1, 2, \cdots, k)$ are $k$ distinct positive
quantities indexed in order of magnitude, then


(a)    $\sum_{per} \dfrac{(-1)^{t(per)}}{\gamma_1(\gamma_1 + \gamma_2) \cdots (\gamma_1 + \gamma_2 + \cdots + \gamma_k)} = \prod_{i>j} \left( \dfrac{g_i - g_j}{g_i + g_j} \right) \Big/ \prod_{i=1}^{k} g_i$,


where $\gamma_1, \gamma_2, \cdots, \gamma_k$ is a permutation of $g_1, g_2, \cdots, g_k$, the sum is taken over
all permutations of the $g_i$, and $t(per)$ is the number of transpositions in the
permutation $\gamma_1, \cdots, \gamma_k$. This identity was first formulated and proved for
$g_i = i$ with considerable aid from J. B. Rosser. Here we give a different and
easier argument which handles the more general situation.

First we obtain another identity as a lemma, namely,


(b)    $\sum_{i=1}^{k} g_i \prod_{j \neq i} \dfrac{g_i + g_j}{g_i - g_j} = \sum_{i=1}^{k} g_i$.

The following argument for (b) was formulated by John Nash. The left side of
(b) is a rational function of $g_1$, say $P(g_1)/Q(g_1)$, which we may suppose to be
reduced to its lowest terms. We first argue that the rational function is really a
polynomial because it does not become infinite for any finite value of $g_1$.
Certainly the only possible roots of $Q(g_1)$ are $g_2, \cdots, g_k$. Suppose $g_1 = g_2 + \epsilon$,
then the first two terms (the others do not have $g_1 - g_2$ in their denominators)
on the left of (b) may be written


        $\dfrac{2g_2 + \epsilon}{\epsilon} \left[ (g_2 + \epsilon) \prod_{j \ge 3} \dfrac{g_2 + \epsilon + g_j}{g_2 + \epsilon - g_j} - g_2 \prod_{j \ge 3} \dfrac{g_2 + g_j}{g_2 - g_j} \right]$,


which is clearly bounded as $\epsilon \to 0$. Similarly no other $g_i$ is a root of $Q(g_1)$; hence
the left side of (b) is a polynomial $P(g_1)$ in $g_1$. Now let $g_1$ become large; the
first term on the left of (b) becomes essentially $g_1$ while the others become
constants; hence

        $P(g_1) = g_1 + C_1(g_2, g_3, \cdots, g_k)$.

Similarly as a function of $g_2$ the left of (b) is of the form

        $g_2 + C_2(g_1, g_3, \cdots, g_k)$,

and so forth. Furthermore, the left of (b) is homogeneous of degree one in the
$g$'s; it must therefore be $\sum_i g_i$.
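
Both identities can be spot-checked in exact rational arithmetic. The sketch below takes (a) in the form $\sum_{per} (-1)^{t(per)} / [\gamma_1(\gamma_1+\gamma_2)\cdots(\gamma_1+\cdots+\gamma_k)] = \prod_{i>j} \frac{g_i - g_j}{g_i + g_j} \big/ \prod_i g_i$ and (b) in the form $\sum_i g_i \prod_{j \neq i} \frac{g_i + g_j}{g_i - g_j} = \sum_i g_i$, with the $g_i$ supplied as increasing positive integers. The function names are hypothetical, and a check of a few cases is of course no substitute for the proof.

```python
from fractions import Fraction
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation p of 0..k-1, computed from its cycle lengths."""
    sign, seen = 1, [False] * len(p)
    for i in range(len(p)):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = p[j]
                length += 1
            if length % 2 == 0:       # an even-length cycle is an odd permutation
                sign = -sign
    return sign

def lhs_a(g):
    """Signed sum over all permutations on the left of identity (a)."""
    total = Fraction(0)
    for p in permutations(range(len(g))):
        denom, partial = Fraction(1), Fraction(0)
        for idx in p:
            partial += g[idx]
            denom *= partial
        total += Fraction(perm_sign(p)) / denom
    return total

def rhs_a(g):
    """Closed form on the right of identity (a)."""
    val = Fraction(1)
    for i in range(len(g)):
        val /= g[i]
        for j in range(i):
            val *= Fraction(g[i] - g[j], g[i] + g[j])
    return val

def lhs_b(g):
    """Left side of the lemma (b)."""
    total = Fraction(0)
    for i, gi in enumerate(g):
        term = Fraction(gi)
        for j, gj in enumerate(g):
            if j != i:
                term *= Fraction(gi + gj, gi - gj)
        total += term
    return total

l00_k3 = lhs_a([1, 2, 3])   # with g_i = i this is L(0, 0) for k = 3
```

With $g_i = i$ the left side of (a) is the sum that arises for $L(0, 0)$, so the same functions also check the closed form for small $k$.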

Having (b) we can prove (a) by induction. It is true for $k = 2$, and we shall
show it to be true for $k + 1$ given it to be true for $k$. Applying (a) to the left
side of the following relation, we have

        $\sum_{per} \dfrac{(-1)^{t(per)}}{\gamma_1(\gamma_1 + \gamma_2)(\gamma_1 + \gamma_2 + \gamma_3) \cdots (\gamma_1 + \gamma_2 + \cdots + \gamma_{k+1})}$

        $= \dfrac{1}{\sum_{1}^{k+1} g_i} \sum_{\gamma_1 < \gamma_2 < \cdots < \gamma_k} (-1)^{t(per)} \left[ \prod_{k \ge i > j} \dfrac{\gamma_i - \gamma_j}{\gamma_i + \gamma_j} \right] \Big/ \prod_{i=1}^{k} \gamma_i$,


where the sum on the right is over all permutations which have $\gamma_1 < \gamma_2 <
\cdots < \gamma_k$. This means that the sum has only $k + 1$ terms; these terms
arise from putting $\gamma_{k+1}$ equal to $g_1, g_2, \cdots, g_{k+1}$ in turn and arranging the
other $g$'s in ascending order. Thus the right side of this last relation may be
written


        $\left[ \prod_{i>j} \left( \dfrac{g_i - g_j}{g_i + g_j} \right) \Big/ \prod_{i=1}^{k+1} g_i \right] \left[ \dfrac{1}{\sum_{1}^{k+1} g_i} \sum_{i=1}^{k+1} g_i \prod_{j \neq i} \dfrac{g_i + g_j}{g_i - g_j} \right]$,


and the final bracket is unity in view of (b).


REFERENCES 


[1] M. A. Girshick, "On the sampling theory of the roots of determinantal equations,"
Annals of Math. Stat., Vol. 10 (1939), pp. 203-224.

[2] R. A. Fisher, "The sampling distribution of some statistics obtained from non-linear
equations," Annals of Eugenics, Vol. 9 (1939), pp. 238-249.

[3] P. L. Hsu, "On the distribution of roots of certain determinantal equations," Annals
of Eugenics, Vol. 9 (1939), pp. 250-258.

[4] S. N. Roy, "p-statistics, or some generalizations on the analysis of variance appropriate
to multivariate problems," Sankhyā, Vol. 3 (1939), pp. 341-396.

[5] H. Hotelling, "Relations between two sets of variates," Biometrika, Vol. 28 (1936),
pp. 321-377.

[6] M. Bôcher, Introduction to Higher Algebra, Macmillan, 1929, p. 171.

[7] S. S. Wilks, "Certain generalizations in the analysis of variance," Biometrika, Vol. 24
(1932), pp. 471-494.





A BIVARIATE EXTENSION OF THE U STATISTIC¹


By D. R. Whitney


Ohio State University 


1. Summary. Let $x$, $y$, and $z$ be three random variables with continuous
cumulative distribution functions $f$, $g$, and $h$. In order to test the hypothesis
$f = g = h$ under certain alternatives two statistics $U$, $V$ based on ranks are
proposed.

Recurrence relations are given for determining the probability of a given
$(U, V)$ in a sample of $l$ $x$'s, $m$ $y$'s, $n$ $z$'s and the different moments of the joint
distribution of $U$ and $V$. The means, second, and fourth moments of the joint
distribution are given explicitly and the limit distribution is shown to be normal.

As an illustration the joint distribution of $U$, $V$ is given for the case $(l, m, n) =
(6, 3, 3)$ together with the values obtained by using the bivariate normal
approximation. Tables of the joint cumulative distribution of $U$, $V$ have been prepared
for all cases where $l + m + n \le 15$.


2. Introduction. Let $x$, $y$, and $z$ be three random variables with continuous
cumulative distributions $f$, $g$, $h$. We wish to test the hypothesis that $f = g = h$
with the alternative that $f > g$, $f > h$, or say, $f > g > h$.

To test such a hypothesis with a sample of $l$ $x$'s, $m$ $y$'s, and $n$ $z$'s, we arrange
the sample values in ascending order and let $U$ count the number of times
a $y$ precedes an $x$, and $V$ count the number of times a $z$ precedes an $x$. As a
critical region for the hypothesis with the alternative $f > g$, $f > h$ we propose
to use $U \le K_1$, $V \le K_2$; or with the alternative $g > f > h$, $U \ge K_3$, $V \le K_4$,
where the constants $K_i$ are chosen to give the correct significance level. Even
if the significance level is fixed the constants $K_i$ are not uniquely determined.
A reasonable principle to follow in this case would be to choose


        $P(U \le K_1) = P(V \le K_2)$    or    $P(U \ge K_3) = P(V \le K_4)$


according to which alternative is chosen. In particular, if $m = n$ this leads to


        $K_1 = K_2$    and    $K_3 + K_4 = l \cdot m$.


3. Moments of the joint distribution of $U$ and $V$. We consider sequences of
$l$ $x$'s, $m$ $y$'s, $n$ $z$'s and let $T_{lmn}(U, V)$ be the number of such sequences in which
a $y$ precedes an $x$ $U$ times and a $z$ precedes an $x$ $V$ times. Omitting the last term
in such a sequence leads to the relation


(1)    $T_{lmn}(U, V) = T_{l-1,mn}(U - m, V - n) + T_{l,m-1,n}(U, V) + T_{lm,n-1}(U, V)$,


1 The U statistic was introduced by H. B. Mann and the author in "On a test of whether
one of two random variables is stochastically larger than the other," Annals of Math. Stat.,
Vol. 18 (1947), pp. 50-60. The present extension was carried out at the suggestion of J. W.
Tukey, Princeton University.





where $T_{lmn}(U, V) = 0$ if $U < 0$; $V < 0$; $U > 0$, $m = 0$; or $V > 0$, $n = 0$; and


        $T_{0mn}(0, 0) = \binom{m + n}{m}$.


Under the hypothesis any of the $(l + m + n)!/l!\,m!\,n!$ sequences has equal
probability. Hence


(2)    $P_{lmn}(U, V) = \dfrac{l}{l + m + n} P_{l-1,mn}(U - m, V - n) + \dfrac{m}{l + m + n} P_{l,m-1,n}(U, V) + \dfrac{n}{l + m + n} P_{lm,n-1}(U, V)$,


where $P_{lmn}(U, V)$ denotes the probability of a sequence of $l$ $x$'s, $m$ $y$'s, $n$ $z$'s
having $y$ precede an $x$ $U$ times and $z$ precede an $x$ $V$ times.
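
The counting recurrence lends itself to a direct memoized implementation, sketched below. The sketch starts the recursion from the empty sequence rather than from the boundary values stated in the text, which is an equivalent but different choice of initial conditions, and the function names are hypothetical. Exact fractions are used so that the null moments can be compared with closed forms such as the familiar values $E(U) = lm/2$ and variance $lm(l + m + 1)/12$.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def T(l, m, n, U, V):
    """Number of sequences of l x's, m y's, n z's with the given (U, V).

    U counts (y, x) pairs with the y first; V counts (z, x) pairs with the z first.
    """
    if U < 0 or V < 0:
        return 0
    if l == 0 and m == 0 and n == 0:
        return 1 if (U == 0 and V == 0) else 0
    total = 0
    if l > 0:   # last term is an x: every y and every z precedes it
        total += T(l - 1, m, n, U - m, V - n)
    if m > 0:   # last term is a y: it precedes no x
        total += T(l, m - 1, n, U, V)
    if n > 0:   # last term is a z: it precedes no x
        total += T(l, m, n - 1, U, V)
    return total

def joint_distribution(l, m, n):
    """Exact null distribution {(U, V): probability} as fractions."""
    count = factorial(l + m + n) // (factorial(l) * factorial(m) * factorial(n))
    return {(U, V): Fraction(T(l, m, n, U, V), count)
            for U in range(l * m + 1) for V in range(l * n + 1)
            if T(l, m, n, U, V)}

dist = joint_distribution(2, 2, 1)
mean_U = sum(p * U for (U, V), p in dist.items())
```

The same function can tabulate a case such as $(l, m, n) = (6, 3, 3)$, since memoization keeps the number of distinct states small.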

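To obtain the mean of $U$ we multiply (2) by $U$ and sum over all $U$, $V$. This
gives


(3)    $E_{lmn}(U) = \dfrac{l}{l + m + n} E_{l-1,mn}(U) + \dfrac{m}{l + m + n} E_{l,m-1,n}(U) + \dfrac{n}{l + m + n} E_{lm,n-1}(U) + \dfrac{lm}{l + m + n}$.

This and a similar equation for $E_{lmn}(V)$, together with the obvious initial
conditions, give


(4)    $E_{lmn}(U) = \dfrac{lm}{2}$,    $E_{lmn}(V) = \dfrac{ln}{2}$.


The recurrence relations for the higher moments about the mean are obtained
by multiplying (2) by $(U - \tfrac12 lm)^i (V - \tfrac12 ln)^j$ and summing over all $U$, $V$. Using
$u = U - \tfrac12 lm$, $v = V - \tfrac12 ln$,


(5)    $E_{lmn}(u^i v^j) = \dfrac{l}{l + m + n} \sum_{\alpha=0}^{i} \sum_{\beta=0}^{j} \binom{i}{\alpha} \binom{j}{\beta} \left(\dfrac{m}{2}\right)^{i-\alpha} \left(\dfrac{n}{2}\right)^{j-\beta} E_{l-1,mn}(u^\alpha v^\beta)$

        $+ \dfrac{m}{l + m + n} \sum_{\alpha=0}^{i} \binom{i}{\alpha} (-1)^{i-\alpha} \left(\dfrac{l}{2}\right)^{i-\alpha} E_{l,m-1,n}(u^\alpha v^j)$

        $+ \dfrac{n}{l + m + n} \sum_{\beta=0}^{j} \binom{j}{\beta} (-1)^{j-\beta} \left(\dfrac{l}{2}\right)^{j-\beta} E_{lm,n-1}(u^i v^\beta)$.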

For $i + j \le 4$ the solutions of (5) satisfying the initial conditions $E_{0mn}(u^i v^j) =
E_{l00}(u^i v^j) = 0$ are found to be

$$E_{lmn}(u^2) = \tfrac{1}{12}\, lm(l + m + 1),$$

$$E_{lmn}(uv) = \tfrac{1}{12}\, lmn,$$

$$E_{lmn}(u^3) = E_{lmn}(u^2 v) = 0,$$





D. R. WHITNEY 
$$E_{lmn}(u^4) = \tfrac{1}{240}\, lm(l + m + 1)(5l^2 m + 5lm^2 - 2l^2 - 2m^2 + 3lm - 2l - 2m),$$

$$E_{lmn}(u^3 v) = \tfrac{1}{240}\, lmn(5l^2 m + 5lm^2 - 2l^2 - 2m^2 + 3lm - 2l - 2m),$$

$$E_{lmn}(u^2 v^2) = \tfrac{1}{720}\, lmn(5l^3 + 5l^2 m + 5l^2 n + 15lmn + 14l^2 + 3lm + 3ln - 6mn + 7l - 2m - 2n - 2).$$

From symmetry considerations it follows that $E_{lmn}(u^i v^j) = 0$ if $i + j$ is odd.
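The low-order moments can be checked by complete enumeration for very small samples. The sketch below is an illustration rather than part of the derivation: it lists every distinct sequence, computes $u$ and $v$ exactly with rational arithmetic, and compares the results with the closed forms for $E(u^2)$, $E(uv)$, $E(u^4)$, $E(u^3 v)$ and $E(u^2 v^2)$. The sample sizes are arbitrary small values chosen so the enumeration stays tiny, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def central_moments(l, m, n):
    """Return a function E(i, j) giving the exact E(u^i v^j) under the hypothesis."""
    seqs = set(permutations("x" * l + "y" * m + "z" * n))  # distinct sequences
    pairs = []
    for s in seqs:
        ys = zs = U = V = 0
        for ch in s:
            if ch == "y":
                ys += 1
            elif ch == "z":
                zs += 1
            else:            # an x: every earlier y and z precedes it
                U += ys
                V += zs
        pairs.append((U - Fraction(l * m, 2), V - Fraction(l * n, 2)))
    return lambda i, j: sum(u**i * v**j for u, v in pairs) / len(pairs)


def closed_forms(l, m, n):
    """Closed-form moments of orders two and four, keyed by (i, j)."""
    q = 5*l*l*m + 5*l*m*m - 2*l*l - 2*m*m + 3*l*m - 2*l - 2*m
    r = (5*l**3 + 5*l*l*m + 5*l*l*n + 15*l*m*n + 14*l*l
         + 3*l*m + 3*l*n - 6*m*n + 7*l - 2*m - 2*n - 2)
    return {
        (2, 0): Fraction(l*m*(l + m + 1), 12),
        (1, 1): Fraction(l*m*n, 12),
        (4, 0): Fraction(l*m*(l + m + 1)*q, 240),
        (3, 1): Fraction(l*m*n*q, 240),
        (2, 2): Fraction(l*m*n*r, 720),
    }


for sizes in [(1, 1, 1), (1, 2, 1), (2, 1, 1)]:
    E = central_moments(*sizes)
    for (i, j), value in closed_forms(*sizes).items():
        assert E(i, j) == value, (sizes, i, j)
    assert E(3, 0) == 0 and E(2, 1) == 0   # odd-order moments vanish
```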
4. Limit distribution of u and v. Let $F(l, m, n)$ be a function of integers
$l, m, n$ and define an operator $\nabla$ by

$$\nabla F(l, m, n) = l[F(l, m, n) - F(l - 1, m, n)] + m[F(l, m, n) - F(l, m - 1, n)] + n[F(l, m, n) - F(l, m, n - 1)].$$


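This permits (5) to be rewritten as

$$\begin{aligned}
\nabla E_{lmn}(u^i v^j) ={}& l \sum_{\substack{0 \le \alpha \le i,\ 0 \le \beta \le j \\ \alpha + \beta < i + j}} \binom{i}{\alpha} \binom{j}{\beta} \left(\frac{m}{2}\right)^{i-\alpha} \left(\frac{n}{2}\right)^{j-\beta} E_{l-1,m,n}(u^\alpha v^\beta) \\
&+ m \sum_{\alpha=0}^{i-1} \binom{i}{\alpha} (-1)^{i-\alpha} \left(\frac{l}{2}\right)^{i-\alpha} E_{l,m-1,n}(u^\alpha v^j) \\
&+ n \sum_{\beta=0}^{j-1} \binom{j}{\beta} (-1)^{j-\beta} \left(\frac{l}{2}\right)^{j-\beta} E_{l,m,n-1}(u^i v^\beta).
\end{aligned} \tag{8}$$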


In order to work with equation (8) we need these properties:

(a) If $\nabla F(l, m, n)$ is a polynomial of degree $t$ in all the variables, $\alpha$ in $l$, $\beta$ in
$m$, $\gamma$ in $n$, then $F(l, m, n)$ is a polynomial of degree $t$ in all the variables,
$\alpha$ in $l$, $\beta$ in $m$, $\gamma$ in $n$.

(b) If $P_t$, $Q_t$ are polynomials of degree $t$ and $l, m, n \to \infty$ so that $F(l, m, n) \to
F_0$ and $\nabla P_t / Q_t \to c$, then $P_t / Q_t \to c/t$.

Leaving the proof of these statements to a later section (Section 5), we shall
apply them now to equation (8). Since $E_{lmn}(u^i v^j) = 0$ for $i + j$ odd we consider
only the case $i + j = 2r$. For $r = 1, 2$, $E_{lmn}(u^i v^j)$ is a polynomial of degree $3r$,
of degree at most $2r$ in $l$, $i$ in $m$, and $j$ in $n$. If we assume this to hold for $i + j <
2r$, then from (8) $\nabla E_{lmn}(u^i v^j)$, $i + j = 2r$, has these properties and hence $E_{lmn}(u^i v^j)$
does also.

In what follows there are two cases according as $i$ and $j$ are both even or both
odd. We give only the first case explicitly. Replacing $i$ and $j$ in (8) by $2i$ and
$2j$, we obtain


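$$\begin{aligned}
\nabla E_{lmn}(u^{2i} v^{2j}) ={}& l \left\{ \binom{2i}{2i-2} \left(\frac{m}{2}\right)^2 E_{l-1,m,n}(u^{2i-2} v^{2j}) + \binom{2i}{2i-1} \binom{2j}{2j-1} \frac{m}{2}\, \frac{n}{2}\, E_{l-1,m,n}(u^{2i-1} v^{2j-1}) \right. \\
&\left. \quad + \binom{2j}{2j-2} \left(\frac{n}{2}\right)^2 E_{l-1,m,n}(u^{2i} v^{2j-2}) \right\} \\
&+ m \binom{2i}{2i-2} \left(\frac{l}{2}\right)^2 E_{l,m-1,n}(u^{2i-2} v^{2j}) + n \binom{2j}{2j-2} \left(\frac{l}{2}\right)^2 E_{l,m,n-1}(u^{2i} v^{2j-2}) \\
&+ P_{3(i+j)-1}(l, m, n),
\end{aligned} \tag{9}$$

where $P_{3(i+j)-1}(l, m, n)$ indicates a polynomial of degree $3(i + j) - 1$ in $l, m, n$,
which is also of degree at most $2(i + j)$ in $l$, $2i$ in $m$, and $2j$ in $n$. This may be
reduced to

$$\begin{aligned}
\nabla E_{lmn}(u^{2i} v^{2j}) ={}& \tfrac14\, lm(l + m)\, i(2i - 1)\, E_{lmn}(u^{2i-2} v^{2j}) + lmn \cdot i \cdot j\, E_{lmn}(u^{2i-1} v^{2j-1}) \\
&+ \tfrac14\, ln(l + n)\, j(2j - 1)\, E_{lmn}(u^{2i} v^{2j-2}) + P_{3(i+j)-1}(l, m, n).
\end{aligned} \tag{10}$$

Now we write

$$\lambda_{lmn}^{\alpha\beta} = \frac{E_{lmn}(u^\alpha v^\beta)}{\sigma_u^\alpha \sigma_v^\beta};$$

then dividing (10) by $\sigma_u^{2i} \sigma_v^{2j} = [\tfrac{1}{12} lm(l + m + 1)]^i\, [\tfrac{1}{12} ln(l + n + 1)]^j$ gives

$$\begin{aligned}
\frac{\nabla E_{lmn}(u^{2i} v^{2j})}{\sigma_u^{2i} \sigma_v^{2j}} ={}& \frac{\tfrac14\, lm(l + m)\, i(2i - 1)}{\tfrac{1}{12}\, lm(l + m + 1)}\, \lambda_{lmn}^{2i-2,2j} + \frac{lmn \cdot i \cdot j}{\tfrac{1}{12} \sqrt{lm(l + m + 1)\, ln(l + n + 1)}}\, \lambda_{lmn}^{2i-1,2j-1} \\
&+ \frac{\tfrac14\, ln(l + n)\, j(2j - 1)}{\tfrac{1}{12}\, ln(l + n + 1)}\, \lambda_{lmn}^{2i,2j-2} + \frac{P_{3(i+j)-1}(l, m, n)}{\sigma_u^{2i} \sigma_v^{2j}}.
\end{aligned} \tag{11}$$

We put

$$\rho(l, m, n) = \frac{E_{lmn}(uv)}{\sigma_u \sigma_v} = \sqrt{\frac{mn}{(l + m + 1)(l + n + 1)}}$$

and use $\rho(l, m, n) \to \rho_0$ to mean $l, m, n \to \infty$ in such a way that

$$\rho(l, m, n) = \sqrt{\frac{mn}{(l + m + 1)(l + n + 1)}} \to \rho_0.$$

We then have for $\rho(l, m, n) \to \rho_0$

$$\lambda_{lmn}^{11} \to \rho_0, \qquad \lambda_{lmn}^{40} \to 3, \qquad \lambda_{lmn}^{31} \to 3\rho_0, \qquad \lambda_{lmn}^{22} \to 1 + 2\rho_0^2. \tag{12}$$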







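Dropping the $l, m, n$ to denote the limiting values, (12) are just special cases,
i.e., $i + j = 2$ or $4$, of

$$\begin{aligned}
\lambda^{2i,2j} &= \frac{(2i)!\,(2j)!}{2^{i+j}} \sum_{\alpha=0}^{\min(i,j)} \frac{(2\rho_0)^{2\alpha}}{(i - \alpha)!\,(j - \alpha)!\,(2\alpha)!}, \\
\lambda^{2i+1,2j+1} &= \frac{(2i + 1)!\,(2j + 1)!}{2^{i+j}}\, \rho_0 \sum_{\alpha=0}^{\min(i,j)} \frac{(2\rho_0)^{2\alpha}}{(i - \alpha)!\,(j - \alpha)!\,(2\alpha + 1)!}.
\end{aligned} \tag{13}$$

Inductively then, we assume for $\alpha + \beta < 2(i + j)$ that $\lambda_{lmn}^{\alpha\beta}$ satisfies (13)
for $\rho(l, m, n) \to \rho_0$. Since $P_{3(i+j)-1}(l, m, n)$ in (11) has degree at most $2(i + j)$
in $l$, $2i$ in $m$, $2j$ in $n$, we obtain

$$\begin{aligned}
\lim_{\rho(l,m,n) \to \rho_0} \frac{\nabla E_{lmn}(u^{2i} v^{2j})}{\sigma_u^{2i} \sigma_v^{2j}} ={}& 3i(2i - 1)\, \frac{(2i - 2)!\,(2j)!}{2^{i+j-1}} \sum_{\alpha=0}^{\min(i-1,j)} \frac{(2\rho_0)^{2\alpha}}{(i - 1 - \alpha)!\,(j - \alpha)!\,(2\alpha)!} \\
&+ 12\, i\, j\, \rho_0\, \frac{(2i - 1)!\,(2j - 1)!}{2^{i+j-2}}\, \rho_0 \sum_{\alpha=0}^{\min(i-1,j-1)} \frac{(2\rho_0)^{2\alpha}}{(i - 1 - \alpha)!\,(j - 1 - \alpha)!\,(2\alpha + 1)!} \\
&+ 3j(2j - 1)\, \frac{(2i)!\,(2j - 2)!}{2^{i+j-1}} \sum_{\alpha=0}^{\min(i,j-1)} \frac{(2\rho_0)^{2\alpha}}{(i - \alpha)!\,(j - 1 - \alpha)!\,(2\alpha)!},
\end{aligned} \tag{14}$$

and this reduces to

$$\lim_{\rho(l,m,n) \to \rho_0} \frac{\nabla E_{lmn}(u^{2i} v^{2j})}{\sigma_u^{2i} \sigma_v^{2j}} = 3(i + j)\, \frac{(2i)!\,(2j)!}{2^{i+j}} \sum_{\alpha=0}^{\min(i,j)} \frac{(2\rho_0)^{2\alpha}}{(i - \alpha)!\,(j - \alpha)!\,(2\alpha)!}. \tag{15}$$

From this it follows that

$$\lim_{\rho(l,m,n) \to \rho_0} \frac{E_{lmn}(u^{2i} v^{2j})}{\sigma_u^{2i} \sigma_v^{2j}} = \frac{(2i)!\,(2j)!}{2^{i+j}} \sum_{\alpha=0}^{\min(i,j)} \frac{(2\rho_0)^{2\alpha}}{(i - \alpha)!\,(j - \alpha)!\,(2\alpha)!}, \tag{16}$$

and in like manner for $E_{lmn}(u^{2i+1} v^{2j+1})$. Therefore the moments of the limit
distribution are those of a bivariate normal distribution. Hence the variables

$$\frac{u}{\sqrt{\tfrac{1}{12}\, lm(l + m + 1)}}, \qquad \frac{v}{\sqrt{\tfrac{1}{12}\, ln(l + n + 1)}}$$

have in the limit a joint normal distribution with means 0, variances 1, and
correlation coefficient $\rho_0$, where

$$\rho_0 = \lim_{l,m,n \to \infty} \sqrt{\frac{mn}{(l + m + 1)(l + n + 1)}}.$$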
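The inductive step, in which the three sums in (14) collapse to $3(i + j)$ times the even-order expression in (13), can be confirmed exactly for particular indices. The sketch below is an illustration, not a proof: it evaluates both sides with rational arithmetic for one hypothetical value of $\rho_0$, and the function names are illustrative only.

```python
from fractions import Fraction
from math import factorial as f


def even_moment(i, j, rho):
    """First line of (13): limiting moment of order (2i, 2j)."""
    s = sum(Fraction((2 * rho) ** (2 * a), f(i - a) * f(j - a) * f(2 * a))
            for a in range(min(i, j) + 1))
    return Fraction(f(2 * i) * f(2 * j), 2 ** (i + j)) * s


def odd_moment(i, j, rho):
    """Second line of (13): limiting moment of order (2i+1, 2j+1)."""
    s = sum(Fraction((2 * rho) ** (2 * a), f(i - a) * f(j - a) * f(2 * a + 1))
            for a in range(min(i, j) + 1))
    return Fraction(f(2 * i + 1) * f(2 * j + 1), 2 ** (i + j)) * rho * s


def limit_of_nabla(i, j, rho):
    """Right-hand side of (14); terms carrying a zero factor i or j are skipped."""
    total = Fraction(0)
    if i > 0:
        total += 3 * i * (2 * i - 1) * even_moment(i - 1, j, rho)
    if i > 0 and j > 0:
        total += 12 * i * j * rho * odd_moment(i - 1, j - 1, rho)
    if j > 0:
        total += 3 * j * (2 * j - 1) * even_moment(i, j - 1, rho)
    return total


rho0 = Fraction(3, 10)   # hypothetical limiting correlation
for i in range(4):
    for j in range(4):
        assert limit_of_nabla(i, j, rho0) == 3 * (i + j) * even_moment(i, j, rho0)
print(even_moment(1, 1, rho0))   # 1 + 2*rho0**2 = 59/50
```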


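5. Properties of $\nabla$.
Lemma 1. If

$$F(x, y, z) = \sum_{i=0}^{\lambda} \sum_{j=0}^{\mu} \sum_{k=0}^{\nu} a_{ijk}\, x^i y^j z^k,$$

then

$$\nabla F(x, y, z) = \sum_{i=0}^{\lambda} \sum_{j=0}^{\mu} \sum_{k=0}^{\nu} A_{ijk}\, x^i y^j z^k,$$

where

$$A_{ijk} = \sum_{\alpha=i}^{\lambda} (-1)^{\alpha-i}\, a_{\alpha jk} \binom{\alpha}{i-1} + \sum_{\beta=j}^{\mu} (-1)^{\beta-j}\, a_{i\beta k} \binom{\beta}{j-1} + \sum_{\gamma=k}^{\nu} (-1)^{\gamma-k}\, a_{ij\gamma} \binom{\gamma}{k-1}.$$

This follows from a straightforward application of the definition of $\nabla$.

Lemma 2. If $F(x, y, z)$ is a polynomial in $x, y, z$ of degree $\sigma$, of degree $\lambda$ in $x$,
$\mu$ in $y$, $\nu$ in $z$, then so is $\nabla F$.

This follows from the representation of $\nabla F$ in Lemma 1.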
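The coefficient formula of Lemma 1 can be compared with the defining differences of $\nabla$ on a concrete polynomial. The sketch below is only an illustration: polynomials are stored as dictionaries mapping exponent triples $(i, j, k)$ to coefficients, a representation chosen here for convenience, and the test polynomial is arbitrary.

```python
from collections import defaultdict
from math import comb


def nabla_coeffs(a):
    """Coefficients A_ijk of nabla F from the coefficients a_ijk of F (Lemma 1)."""
    A = defaultdict(int)
    for (p, q, r), c in a.items():
        # x[x^p - (x-1)^p] contributes (-1)^(p-i) C(p, i-1) x^i for 1 <= i <= p,
        # and similarly for y and z.
        for i in range(1, p + 1):
            A[(i, q, r)] += (-1) ** (p - i) * comb(p, i - 1) * c
        for j in range(1, q + 1):
            A[(p, j, r)] += (-1) ** (q - j) * comb(q, j - 1) * c
        for k in range(1, r + 1):
            A[(p, q, k)] += (-1) ** (r - k) * comb(r, k - 1) * c
    return {key: v for key, v in A.items() if v != 0}


def evaluate(coeffs, x, y, z):
    """Value of the polynomial with the given coefficients at (x, y, z)."""
    return sum(c * x**i * y**j * z**k for (i, j, k), c in coeffs.items())


def nabla_direct(coeffs, x, y, z):
    """nabla F computed from the defining differences."""
    F = lambda *p: evaluate(coeffs, *p)
    return (x * (F(x, y, z) - F(x - 1, y, z))
            + y * (F(x, y, z) - F(x, y - 1, z))
            + z * (F(x, y, z) - F(x, y, z - 1)))


F = {(2, 1, 0): 1, (1, 1, 1): 4, (0, 0, 3): -2}   # x^2 y + 4xyz - 2z^3
A = nabla_coeffs(F)
assert all(evaluate(A, x, y, z) == nabla_direct(F, x, y, z)
           for x in range(5) for y in range(5) for z in range(5))
```

For a term of highest total degree the coefficient is multiplied by $i + j + k$, which is the fact used in Lemma 3.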

Lemma 3. For any polynomial $F(x, y, z)$ there exists a polynomial $G(x, y, z)$
such that $\nabla G = F$.

Let the coefficients of $F$ be denoted by $A_{ijk}$ and the unknown coefficients of
$G$ by $a_{ijk}$. The lemma will follow if we can solve the equations in Lemma 1 for
$a_{ijk}$, $i = 0, 1, \cdots, \lambda$; $j = 0, 1, \cdots, \mu$; $k = 0, 1, \cdots, \nu$. For $i + j + k$ a maximum
for all the $i, j, k$ of $A_{ijk}$, we have

$$A_{ijk} = a_{ijk} \binom{i}{i-1} + a_{ijk} \binom{j}{j-1} + a_{ijk} \binom{k}{k-1} = (i + j + k)\, a_{ijk}.$$

By induction we assume that the equation can be solved for the $a_{ijk}$ for all
$i, j, k$ such that $i + j + k > t$. Then for $i + j + k = t$ we have $A_{ijk} =
(i + j + k)\, a_{ijk}$ plus $a$'s whose subscripts add to more than $t$. Hence the $a_{ijk}$
can be determined.

Lemma 4. If $\nabla[F(x, y, z) - G(x, y, z)] = 0$, then $F(x, y, z) - F(0, 0, 0) =
G(x, y, z) - G(0, 0, 0)$.

Let $t = x + y + z$. The lemma is true for $t = 0$, and we assume it to be true
for all $x, y, z$ such that $x + y + z < t$. Then for $x + y + z = t$,

$$\nabla[F(x, y, z) - G(x, y, z)] = 0$$

gives

$$\begin{aligned}
&(x + y + z)[F(x, y, z) - G(x, y, z)] - x[F(x - 1, y, z) - G(x - 1, y, z)] \\
&\quad - y[F(x, y - 1, z) - G(x, y - 1, z)] - z[F(x, y, z - 1) - G(x, y, z - 1)] = 0.
\end{aligned}$$

Using our induction assumption,

$$(x + y + z)[F(x, y, z) - G(x, y, z)] = (x + y + z)[F(0, 0, 0) - G(0, 0, 0)],$$

and the lemma follows.

6. Distribution of u and v in a particular case with $l = 6$, $m = n = 3$. Using
the relation (1), the table of $T_{633}(U, V)$ (Table 1) was obtained. In this case
$E(U) = E(V) = 9$, $\sigma_U^2 = \sigma_V^2 = 15$, $\sigma_{UV} = 4.5$, $\rho = 0.3$.





TABLE 1
$T_{633}(U, V)$


155 178 ; 17S 155 150 108 
160 194 173 144 139 95 46 ¢ 

158 160 144 122 110 74 52 38 2 
146 158 139 110 103 64 33 2 
108 112 95 64 $3 31 22 13 
85 91 71 5: 49 31 

66 65 33 «22 

42 4] 21 13 

20 1S 10 5 

16 17 



TABLE 2 


$\sum_{V=0}^{k} \sum_{U=0}^{h} p_{633}(U, V)$


62 96 137 191 242 298 
(91) (182) (181) (287) (295) 


86 120 166 210 256 
(80) (114) (156) (249) 


73 102 139 4 210 
(95) (128) d (201) 
84 113 166 
(76) (101) (156) 


63 84 120 
(57) = (76) (114) 


46 61 86 
(41) «(4 ) (80) 


32 41 
(28) 


19 
(18) 


10 
(11) 










Table 2 gives the cumulative distribution $\sum_{V=0}^{k} \sum_{U=0}^{h} p_{633}(U, V)$. The numbers
have all been multiplied by 1000. The figures in parentheses are the values
obtained by using $(U - 9 - \tfrac12)/\sqrt{15}$, $(V - 9 - \tfrac12)/\sqrt{15}$ as random variables
from a bivariate normal distribution of means zero, variances one, and
correlation coefficient 0.3.


7. Example. Suppose that y, x, z denote the lengths of life of rats that have 
been exposed to insecticides of supposedly decreasing toxicity. We would then 
be interested in the hypothesis g = f = h under the alternative g > f > h. 

For a sample of 3 y's, 6 x's, and 3 z's, a critical region of size .044 is found
from the preceding table to be $U \ge 12$, $V \le 6$. In an experiment the sequence


TABLE 2—Continued 


$\sum_{V=0}^{k} \sum_{U=0}^{h} p_{633}(U, V)$


400 440 478 502 520 533 541 544 548 
(404) (447) (482) (508) (526) (537) (544) (548) (551) 


338 369 398 418 431 441 447 450 452 
(336) (370) (398) (417) (430) (439) (444) (446) (449) 


272 29 318 332 342 349 353 355 357 
(268) 29% (313) (827) (337) (345) (346) (848) (849) 


213 23 246 256 263 268 271 272 274 
(204) (236) (245) (252) (255) (258) (259) (260) 


151 \ 173 179 184 187 189 190 190 
(147) 59) (168) (174) (178) (180) (182) (183) (183) 


106 ‘ 119 124 127 129 130 130 131 
(101) 3 (118) (120) (121) (122) (123) (128) 


68 7 79 81 32 83 83 83 
(65) ¢ (75) (76) (78) (78) (78) 


39 45 46 7 47 47 48 
(40) (45) (46) 7 (47) (47) (47) 


20 ‘ , 23 23 . 24 24 24 
(23) 29) (26) (26) (26) (26) (27) (27) 


10 il 12 12 12 12 12 
(13) d (14) (14) (14) (14) (14) (14) 


13 14 15 16 17 18 







yyxxxyxzxzzx was obtained. For this sample $U = 15$, $V = 4$, and consequently
we presume the toxic effects to be as supposed.
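The two counts for this sample can be reproduced mechanically. The sketch below reads the observed sequence as yyxxxyxzxzzx (3 y's, 6 x's, 3 z's, the reading consistent with $U = 15$, $V = 4$) and checks it against the critical region $U \ge 12$, $V \le 6$; the function name is an illustrative choice.

```python
def u_v_statistics(sequence):
    """Return (U, V) for a string of 'x', 'y', 'z' given in observed order."""
    ys = zs = U = V = 0
    for label in sequence:
        if label == "y":
            ys += 1
        elif label == "z":
            zs += 1
        elif label == "x":
            U += ys   # every y seen so far precedes this x
            V += zs   # every z seen so far precedes this x
        else:
            raise ValueError(f"unexpected label {label!r}")
    return U, V


U, V = u_v_statistics("yyxxxyxzxzzx")
in_critical_region = U >= 12 and V <= 6
print(U, V, in_critical_region)   # 15 4 True
```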
For a sample of 7 y's, 6 x's, 8 z's, we first compute


$$E(U) = 21, \quad E(V) = 24, \quad \sigma_U^2 = 49, \quad \sigma_V^2 = 60.$$
The critical region can be written as 


$$\frac{U - E(U)}{\sigma_U} \ge h, \qquad \frac{V - E(V)}{\sigma_V} \le -k,$$


where h, k are to be determined to give a significance level of 5%, say, and 


subject to

$$P\!\left(\frac{U - E(U)}{\sigma_U} \ge h\right) = P\!\left(\frac{V - E(V)}{\sigma_V} \le -k\right).$$


With the normal approximation to the distribution of U or V the last condition
implies $h = k$. Then entering Pearson's table for the normal bivariate distribution
with $\rho = -.52$ (interpolating between $-.50$ and $-.55$) we find that
$h = k = .37$ are the desired values. This gives a 5% critical region of $U \ge 24$,
$V \le 21$.
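The quantities entering this approximation follow from the moment formulas of Section 3. As a rough sketch (the value $h = k = .37$ is simply taken from the text, not recomputed from a bivariate normal table), the means, variances and correlation for $l = 6$, $m = 7$, $n = 8$ and the resulting integer cut-offs can be obtained as follows; `null_moments` is a hypothetical helper name.

```python
from math import ceil, floor, sqrt


def null_moments(l, m, n):
    """Means, variances, covariance and correlation of U and V under the hypothesis."""
    mean_U, mean_V = l * m / 2, l * n / 2
    var_U = l * m * (l + m + 1) / 12
    var_V = l * n * (l + n + 1) / 12
    cov_UV = l * m * n / 12
    rho = cov_UV / sqrt(var_U * var_V)
    return mean_U, mean_V, var_U, var_V, cov_UV, rho


mean_U, mean_V, var_U, var_V, cov_UV, rho = null_moments(l=6, m=7, n=8)
h = k = 0.37                              # value quoted from Pearson's table in the text
U_min = ceil(mean_U + h * sqrt(var_U))    # smallest integer U in the region
V_max = floor(mean_V - k * sqrt(var_V))   # largest integer V in the region
print(mean_U, mean_V, var_U, var_V, round(rho, 2), U_min, V_max)
```

The correlation enters the table with a negative sign because the region pairs large values of $U$ with small values of $V$.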





RELATIONS BETWEEN VARIOUSLY DEFINED EFFECTS AND 
INTERACTIONS IN ANALYSIS OF VARIANCE 


By S. Vajda


London, England 


1. Summary. From an algebraic point of view the analysis of variance tests 
of effects and interactions can be based on the minimum values of a certain 
quadratic expression in which the “h-matrix’’ (defined in Section 3) is funda- 
mental. The arbitrariness in the choice of this matrix reflects the arbitrariness 
in the definition of effects and interactions. The paper considers the dependence 
of the result of these tests on the h-matrix used and expresses the answer by the 
two theorems of Section 4, which are proved in the subsequent sections. 


2. Introduction. The sums of squares which appear in an analysis of variance 
when the significance of effects and/or interactions is tested can be obtained 
by taking the minima with regard to values $a_{k_1 \cdots k_s}$ of such expressions as

$$\sum_{i_1=1}^{n_1} \cdots \sum_{i_s=1}^{n_s} g_{i_1 \cdots i_s} \Big[ y_{i_1 \cdots i_s} - \sum_{k_1} \cdots \sum_{k_s} a_{k_1 \cdots k_s}\, h_{k_1 \cdots k_s}(i_1 \cdots i_s) \Big]^2, \tag{1}$$

where the $y_{i_1 \cdots i_s}$ are the means of $g_{i_1 \cdots i_s}$ observed values for levels $i_t$ of variables
$x_t$ $(t = 1, \cdots, s)$, respectively, and the values $h$ form a nonsingular matrix
which will be described in detail in the next section. The summation inside the
bracket in (1) is carried out over sets $(k_1 \cdots k_s)$ of subscripts, $0 \le k_t \le n_t - 1$,
which depend on the aggregate of effects and interactions to be tested. If all
$n_1 \cdots n_s$ possible sets appeared in (1), the minimum would, of course, be zero.
To each test there belongs a set of $k$'s which is left out of the combinations of
subscripts in (1) according to the following rule:

The interaction of order $(t - 1)$ between $x_1, \cdots, x_t$ is tested by omitting all
$k_1 \cdots k_t 0 \cdots 0$ for which $k_1 \cdots k_t \ne 0$. (A main effect is equivalent to an interaction
of order zero.) An aggregate of interactions is tested by leaving out all combinations
referring to any of its several components [2].

As an illustration, let us take $s = 2$ and $g_{i_1 i_2} = 1$. We choose the following
(orthogonal) matrix of $h_{k_1 k_2}(i_1 i_2)$:


(i; %) = . 31 
(ky ke) = (1 1 1 
1 







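If we test for the first main effect, then we retain $a_{00}$, $a_{01}$, $a_{11}$, and $a_{21}$, and
we obtain for the minimum, after straightforward calculations,

$$\frac{(y_{11} + y_{12})^2 + (y_{21} + y_{22})^2 + (y_{31} + y_{32})^2}{2} - \frac{\left(\sum\sum y_{i_1 i_2}\right)^2}{6}.$$

Similarly, testing for the interaction, we would retain $a_{00}$, $a_{01}$, $a_{10}$, $a_{20}$, and
obtain

$$\begin{aligned}
\sum\sum y_{i_1 i_2}^2 &- \tfrac12 \left[ (y_{11} + y_{12})^2 + (y_{21} + y_{22})^2 + (y_{31} + y_{32})^2 \right] \\
&- \tfrac13 \left[ (y_{11} + y_{21} + y_{31})^2 + (y_{12} + y_{22} + y_{32})^2 \right] + \left(\sum\sum y_{i_1 i_2}\right)^2 / 6.
\end{aligned}$$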


If we had taken general weights $g_{i_1 i_2}$, but still using the same values for
the h-matrix, then we should have obtained results which are equivalent to those
given by Yates' "method of weighted squares of means" [5].
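The two minima of the illustration can be checked numerically on a $3 \times 2$ table with unit weights. The sketch below uses made-up cell means. It assumes that retaining $a_{00}$, $a_{01}$, $a_{10}$, $a_{20}$ amounts to fitting the additive model (constant, row and column terms), whose least-squares fit in this balanced layout is row mean plus column mean minus grand mean.

```python
from fractions import Fraction

# Hypothetical 3 x 2 table of cell means y[i1][i2]; all weights equal to 1.
y = [[Fraction(4), Fraction(7)],
     [Fraction(6), Fraction(5)],
     [Fraction(9), Fraction(11)]]

rows = [sum(r) for r in y]                        # y_{i1,1} + y_{i1,2}
cols = [sum(r[j] for r in y) for j in range(2)]   # y_{1,i2} + y_{2,i2} + y_{3,i2}
grand = sum(rows)
sum_sq = sum(v * v for r in y for v in r)

# Closed forms quoted in the illustration.
main_effect_1 = sum(t * t for t in rows) / 2 - grand**2 / 6
interaction = (sum_sq - sum(t * t for t in rows) / 2
               - sum(t * t for t in cols) / 3 + grand**2 / 6)

# Residual sum of squares of the additive fit (interaction terms omitted).
fitted = [[rows[i] / 2 + cols[j] / 3 - grand / 6 for j in range(2)]
          for i in range(3)]
residual = sum((y[i][j] - fitted[i][j]) ** 2 for i in range(3) for j in range(2))

assert residual == interaction
print(main_effect_1, interaction)
```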


3. Assumptions and definitions. Let the "h-matrix" $h_{k_1 \cdots k_s}(i_1 \cdots i_s)$ $(i_t = 1, \cdots,
n_t;\ k_t = 0, 1, \cdots, n_t - 1)$ be such that all the elements in the same row have
equal sets of subscripts and all the elements in the same column have equal sets
of arguments. It will be assumed that this matrix satisfies the following conditions:

CONDITION A.

$$\sum_{i_1=1}^{n_1} \cdots \sum_{i_s=1}^{n_s} w_{i_1 \cdots i_s}\, h_{k_1 \cdots k_s}(i_1 \cdots i_s)\, h_{m_1 \cdots m_s}(i_1 \cdots i_s) \ne 0$$

if simultaneously $k_t = m_t$, and $= 0$ otherwise. The $w_{i_1 \cdots i_s}$ are positive weights.
It follows that the h-matrix is not singular.

CONDITION B. If any $k_t = 0$, then $h_{k_1 \cdots k_t \cdots k_s}$ is independent of $i_t$.

In particular, if the h's are orthogonal polynomials of degrees $k_t$ in $i_t$, then
Conditions A and B hold by definition.

It has been shown [1] that these two conditions can be satisfied simultaneously
only if the weights are "proportionate," i.e., if $w_{i_1 \cdots i_s} / \sum_{i_t} w_{i_1 \cdots i_s}$ is independent
of all $i_m$ $(m \ne t)$ for all $t$.

From Condition A can be derived the following lemma, which will be used
at a later stage:

Lemma. If $k_t \ne 0$, then $\sum_{i_t=1}^{n_t} w_{i_1 \cdots i_s}\, h_{k_1 \cdots k_s}(i_1 \cdots i_s) = 0$.

PROOF. We assume that $t = 1$; this clearly does not restrict the generality of
the argument, since it may be repeated identically for any other value of $t$.
From Condition A we have the equations:

$$\sum_{i_1=1}^{n_1} \cdots \sum_{i_s=1}^{n_s} w_{i_1 \cdots i_s}\, h_{k_1 \cdots k_s}(i_1 \cdots i_s)\, h_{0 m_2 \cdots m_s}(i_1 \cdots i_s) = 0$$

for all $m_2, \cdots, m_s$, since $k_1$ is assumed to be different from zero. If we regard
the $n_2 \cdots n_s$ expressions $\sum_{i_1=1}^{n_1} w_{i_1 \cdots i_s}\, h_{k_1 \cdots k_s}(i_1 \cdots i_s)$ for all $i_2, \cdots, i_s$ as
unknown values, we have the same number of linear homogeneous equations for
them. The determinant of the system is orthogonal and hence not zero. It follows
that the unknown values must be zero, and thus the lemma is proved.







All sets $(k_1 \cdots k_t 0 \cdots 0)$ with $k_1 \cdots k_t \ne 0$ form a "block," which we denote
by $((k_1 \cdots k_t))$. The meaning of $((k_{m_1} \cdots k_{m_t}))$ is immediately obvious. Every
set of subscripts belongs to one and only one block. If we consider a particular
block and then omit one or more values from within the double brackets denoting
it, another block is obtained, which we call a "sub-block" of the former.


4. The problem. Even with Conditions A and B to be satisfied, there remains 
still an arbitrariness in the choice of the h-matrix, and this reflects an arbitrariness 
in the definition of interactions [3]. If the h's are such that Condition A is satisfied
with $w_{i_1 \cdots i_s} = g_{i_1 \cdots i_s}$ for all $i_t$, then the computation of the minimum of (1)
becomes very simple, but clearly this cannot be a reason for choosing the h-matrix
accordingly [4]. However, in this paper we shall be concerned with another
aspect of the situation: we wish to find out whether two different h-matrices
can lead to the same minimum value and, if so, under what conditions. The 
answer depends on the particular test carried out and is expressed by the follow- 
ing two theorems. 

THEOREM 1. If two h-matrices satisfy Condition A with regard to the same weights,
then both lead to the same minimum of (1), whatever the aggregate of interactions
tested.

THEOREM 2. If the aggregate is such that for each retained block of subscripts all
its sub-blocks are also retained, then the minimum of (1) is independent of the
h-matrix (even if the latter is not orthogonal with regard to any weights).

It follows from the latter theorem that when only the highest order interaction
(that of order $s - 1$) is tested, and hence all sets of subscripts except those
constituting the block $((k_1 \cdots k_s))$ are retained, any h-matrix leads to the same
conclusion.


5. Transformation of the problem. In what follows we shall denote, where
no misunderstanding can arise, the various sets $(i_1 \cdots i_s)$ by $I_1, \cdots, I_N$, and
the sets $(k_1 \cdots k_s)$ by $K_1, \cdots, K_M$. Here $N = n_1 \cdots n_s$, and $M$ is the number
of retained sets of subscripts, e.g., $M = (n_1 - 1) \cdots (n_t - 1)$ if only the
block $((k_1 \cdots k_t))$ were retained. We have $N > M$, except in the trivial case
where the minimum of (1) is zero.

Let us now imagine that we have two h-matrices, the elements of which are
denoted by $h$ and $h'$ respectively. If for any given set of $a_{K_i}$ $(i = 1, \cdots, M)$ we
can find a set $a'_{K_i}$ so that

$$\sum_{i=1}^{M} a_{K_i}\, h_{K_i}(I_t) = \sum_{i=1}^{M} a'_{K_i}\, h'_{K_i}(I_t) \tag{2}$$

for all $I_t$ $(t = 1, \cdots, N)$, then clearly the set of values which (1) can assume is
identical with that of a similar expression when $h$ is replaced by $h'$. Hence the
minima of the two expressions will also be the same.

It follows that different h-matrices will lead to the same minimum of (1) if
they and the retained blocks of subscripts are such that (2) can be solved for the
$a'_{K_i}$, assuming that the $a_{K_i}$ are given. In (2) there are $N$ equations for the $M$







unknowns $a'_{K_i}$. It will be possible, therefore, to solve the set only if not more
than $M$ of the equations are linearly independent.
Regarding, to begin with, only the left-hand side (l.h.s.) of (2), we can
certainly select $M$ sets of arguments $I_{t_1}, \cdots, I_{t_M}$ so that the determinant
$|h_{K_i}(I_{t_j})| \ne 0$ (since the complete h-matrix is not singular). Hence for any
further argument $I$, say, we can solve the system of linear equations

$$\begin{aligned}
h_{K_1}(I) &= \sum_{j=1}^{M} C_{t_j}\, h_{K_1}(I_{t_j}), \\
&\ \ \vdots \\
h_{K_M}(I) &= \sum_{j=1}^{M} C_{t_j}\, h_{K_M}(I_{t_j}).
\end{aligned} \tag{3}$$

This gives $h_{K_i}(I)$ as a linear combination of $h_{K_i}(I_{t_1}), \cdots, h_{K_i}(I_{t_M})$ which is the
same for all $i = 1, \cdots, M$. Therefore the l.h.s. of those equations in (2) in
which $I_t = I$ will again be the same linear combination of the l.h.s. of the equations
in which the arguments are $I_{t_1}, \cdots, I_{t_M}$ respectively. Consequently the
whole equation, written for $I$, will be the very same linear combination of the
equations for the $I_{t_j}$ severally, if it can be shown that the $C_{t_j}$ which we find
from (3) are equally applicable to the r.h.s. of (2), i.e., to the $h'_{K_i}$. Since the
two matrices are, by the assumptions of Theorem 1, orthogonal with regard to
the same weights, it is sufficient to prove that the $C_{t_j}$, which are the solutions
of (3), although possibly dependent on the weights $w_{i_1 \cdots i_s}$, do not otherwise
depend on the h-matrix considered.


6. Proof of Theorem 1. In general, the sets of subscripts $K_1, \cdots, K_M$ will
not all belong to the same block. We consider first the block out of which $K_1$
is taken and assume that it consists of $K_1, \cdots, K_P$ $(P \le M)$. It is no restriction
of generality to assume further that this is the block $((k_1 \cdots k_m))$, so that
$P = (n_1 - 1) \cdots (n_m - 1)$.

We fix our attention on one single set belonging to this block, say
$(k_1, \cdots, k_m, 0, \cdots, 0)$. Conditions A and B imply linear relations between
the $h_{k_1 \cdots k_m 0 \cdots 0}(i_1 \cdots i_s)$, and we shall now establish how many of these values
can be chosen independently, thereby fixing all others implicitly. If $h_{k_1 \cdots k_m 0 \cdots 0}$
is known for $(i_1, \cdots, i_s)$ where the $i_t$ are some fixed values, then, by virtue of
Condition B, it is also known for all $(i_1, \cdots, i_m, i_{m+1}, \cdots, i_s)$ where
the $i_{m+1}, \cdots, i_s$ are arbitrary. We need therefore only investigate relations
between the $n_1 \cdots n_m$ values $h_{k_1 \cdots k_m 0 \cdots 0}(i_1 \cdots i_m i_{m+1} \cdots i_s)$. These are not all
independent either, since our lemma gives, for $r = 1, \cdots, m$,

$$\sum_{i_r=1}^{n_r} w_{i_1 \cdots i_s}\, h_{k_1 \cdots k_m 0 \cdots 0}(i_1 \cdots i_s) = 0 \tag{4}$$

for all $i_1, \cdots, i_{r-1}, i_{r+1}, \cdots, i_s$.

Thus only $(n_1 - 1) \cdots (n_m - 1) = P$ values among the $h_{k_1 \cdots k_m 0 \cdots 0}(i_1 \cdots i_s)$
will be independent, and it is easy to indicate how such a set can be found. In
the matrix $\| h_{K_j}(I_t) \|$ $(j = 1, \cdots, P;\ t = t_1, \cdots, t_M)$ there must be a square
sub-matrix of order $P$ which is not singular, since otherwise the determinant of
(3) would be zero, contrary to our assumptions. Let this sub-matrix be
$\| h_{K_j}(I_u) \|$ $(u = t_1, \cdots, t_P$; $j$ as before). Then it follows that the $h_{k_1 \cdots k_m 0 \cdots 0}(I_u)$
constitute a set of values which can arbitrarily be selected. Indeed, if they were
dependent, by virtue of (4) and Condition B, then identical linear relations
would hold for all $K_j$, i.e., for all rows of the matrix $\| h_{K_j}(I_u) \|$, which would
hence be singular.

We may, then, rewrite the first $P$ equations in (3) by expressing all $h(I_t)$ on
the r.h.s. in terms of $I_{t_1}, \cdots, I_{t_P}$ as arguments. The coefficients will be linear
combinations of $C_{t_j}$ and of the weights, which appear in (4). Since the l.h.s.,
i.e., $h(I)$, with subscripts of $h$ as before, can also be expressed as a linear combination
of these same $h(I_u)$, by virtue of (4) and Condition B, we see that the
$C_{t_j}$ must satisfy $P$ identities, in which only the weights $w_{i_1 \cdots i_s}$ are parameters.

All blocks to which the $K_1, \cdots, K_M$ belong can be treated in the same way
and thus we obtain altogether $M$ equations from which the $C_{t_j}$ $(j = 1, \cdots, M)$
can be obtained. They will depend on the weights, but not otherwise on the
h-matrix. This completes the proof of Theorem 1.


7. Proof of Theorem 2. We turn now to Theorem 2, and assume that the
sets of subscripts retained in (1) are those of the blocks $B_0, B_1, \cdots, B_n$ and
of all their sub-blocks. We can at once indicate the sets of arguments $J_1, J_2, \cdots$
equal in number to the retained sets of subscripts, and such that $h(j_1 \cdots j_s)$ can
be expressed as a linear combination of $h(J_1), h(J_2), \cdots$ for all sets of subscripts
considered. For this purpose we take, for each retained set $(k_1 \cdots k_s)$, the set
of arguments $(k_1 + 1, \cdots, k_s + 1)$. Thus there will be the same number $(= M)$
of sets of arguments as there are sets of retained subscripts. We note in particular,
that if any of the $k_t = 0$, the corresponding $j_t$ will be unity. This will
be the case in respect of all sets of a block, if it is true for any set in it.

To simplify our formulae, we introduce the following notation: If $(J) =
(j_1 \cdots j_s)$, then $(J)_i$ is the result of replacing by unity all those $j_t$ which correspond
to a $k_t = 0$ in block $B_i$. Further, $(J)_{ij}$ is the result of replacing by unity
all those $j_t$ which correspond to a $k_t = 0$ either in $B_i$ or in $B_j$, or, in other words,
those $j_t$ which do not correspond to the largest common sub-block of $B_i$ and
$B_j$. The notation $(J)_{ij \cdots k}$ is similarly defined. Now if $K_i$ is any set of subscripts
in the block $B_i$, then it follows from Condition B that

$$h_{K_i}(j_1 \cdots j_s) = h_{K_i}(j_1 \cdots j_s)_i$$

and, more generally,

$$h_{K_i}(j_1 \cdots j_s)_{j \cdots k} = h_{K_i}(j_1 \cdots j_s)_{ij \cdots k}$$

since all those additional arguments 1 in $(j_1 \cdots j_s)_{ij \cdots k}$ which do not already
appear in $(j_1 \cdots j_s)_{j \cdots k}$ correspond to zeros in the subscripts of $B_i$. Moreover,
these relations remain true if we take, instead of $B_i$, any of its sub-blocks, since
such a sub-block contains all those zeros which were in $B_i$ (and some more).




We shall now prove that the relation

$$h_K(J) = \sum_{i=0}^{n} h_K(J)_i - \sum_{i<j} h_K(J)_{ij} + \cdots + (-1)^n h_K(J)_{01 \cdots n}, \tag{5}$$

$(J)$ being an arbitrary set of arguments, holds for all $K$ out of $B_0, B_1, \cdots, B_n$
and also out of any of their sub-blocks. This is a linear relation of the type
which we need for the proof of Theorem 2 and we see that all sets of arguments
appearing on the r.h.s. are among those $J_1, J_2, \cdots$ which we have initially
selected as a basis. Hence, if we prove that this relation holds for any $K$
out of the blocks and sub-blocks considered, then Theorem 2 follows.

First let $K$ be a set out of $B_0$. Equation (5) can be written as follows:

$$\begin{aligned}
h_K(J) ={}& h_K(J)_0 + \sum_{i=1}^{n} h_K(J)_i - \sum_{i=1}^{n} h_K(J)_{0i} - \sum_{1 \le i<j} h_K(J)_{ij} \\
&+ \sum_{1 \le i<j} h_K(J)_{0ij} + \cdots + (-1)^n h_K(J)_{01 \cdots n}.
\end{aligned}$$

Now we have $h_K(J) = h_K(J)_0$. Moreover, the second term on the r.h.s. cancels
with the third, the fourth with the fifth, and so on until all terms are exhausted.
This proves relation (5) for $K$ out of $B_0$. But it is evident that the proof could
equally well have been carried out for any other of the given blocks or for any
of their sub-blocks. This completes the proof of Theorem 2. It will be noticed
that no weights appear in (5), so that under the given conditions the theorem
holds even for matrices which are not orthogonal (in the sense of Condition A)
with regard to any set of weights.


REFERENCES

[1] S. Vajda, "A note on the use of weighted orthogonal functions in statistical analysis,"
Proc. Cambridge Philos. Soc., Vol. 44 (1948), p. 588.
[2] S. Vajda, "An outline of the theory of the analysis of variance," Jour. Inst. Actuaries
Students' Soc., Vol. 7 (1948), p. 235.
[3] D. J. Finney, "Main effects and interactions," Jour. Am. Stat. Assn., Vol. 43 (1948),
p. 566.
[4] S. Vajda, "Technique of the analysis of variance," Nature, Vol. 160 (1947), p. 27.
[5] F. Yates, "The analysis of multiple classifications with unequal numbers in the different
classes," Jour. Am. Stat. Assn., Vol. 29 (1934), p. 51.





ON UNIFORMLY CONSISTENT TESTS 
By AGNES BERGER


New York City 


1. Introduction. If we wish to decide on the true distribution of a random
variable known to be distributed according to one or the other of two given
distributions $F_0$ and $F_1$, then, no matter how small a bound is given in advance, it
is always possible to devise a test based on a sufficiently large number of
independent observations for which the probabilities of erroneous decisions are
smaller than the previously assigned bound. A sequence of tests for which the
corresponding probabilities of errors tend to zero has been called consistent [1].$^1$

Let us suppose now that all we know about the true distribution of some ran- 
dom variable is that it belongs to one of two given families of distributions 
and it is desired to decide which of the two it belongs to; i.e., we have to test a 
composite hypothesis. It may again be possible to construct a sequence of tests 
$\{T_j\}$, $j = 1, 2, \cdots$, such that for any $\epsilon > 0$ there exists an index $N$ such that
for $j > N$ the probabilities of errors corresponding to $T_j$ are smaller than $\epsilon$.
The sequence $\{T_j\}$ may then be called uniformly consistent. Conditions under
which uniformly consistent tests exist have been given by von Mises [5], and 
by Wald [2], [3], [4], as implied, for example, by his proof of the uniform con- 
sistency of the likelihood ratio test. In this paper a different set of conditions is 
given which do not restrict in any way the nature of the distribution functions 
considered. It is also shown that the conditions to be described are satisfied in a 
large class of cases occurring in practical statistics. 

Since the results we are to prove have their counterpart in abstract measure 
theory we shall take advantage of that method. The reader will have no difficulty 
in establishing the correspondence between the statistical and measure theoretical 
formulation. 

Notations. Let $X$ be an arbitrary set and $\mathfrak{B}$ a Borel field of subsets $B$ of $X$.
Let $\mathfrak{M}(\mathfrak{B})$ be the family of all probability measures $m(B)$ defined on $\mathfrak{B}$, i.e., the
family of all countably additive nonnegative set functions defined on $\mathfrak{B}$ for
which $m(X) = 1$. Hereafter a "measure" will denote an element of $\mathfrak{M}(\mathfrak{B})$ and
a "set of measures" a subset of $\mathfrak{M}(\mathfrak{B})$. For any positive integer $k$, let $X^k$ be
the $k$th direct product of $X$ by itself, $\mathfrak{B}^{(k)}$ the $k$th direct product of $\mathfrak{B}$ by itself,
$\mathfrak{E}^k$ the field consisting of all finite sums of sets of $\mathfrak{B}^{(k)}$ and $\mathfrak{B}^k$ the smallest $\sigma$-field
containing $\mathfrak{E}^k$. For any measure $m$ on $\mathfrak{B}$, we define $m^k$ in the usual way as the
unique measure defined on $\mathfrak{B}^k$ for which

$$m^k\Big( \sum_{i=1}^{l} B_{i1} \times B_{i2} \times \cdots \times B_{ik} \Big) = \sum_{i=1}^{l} m(B_{i1})\, m(B_{i2}) \cdots m(B_{ik})$$

for any disjoint system $B_{ij} \in \mathfrak{B}$, $i = 1, 2, \cdots, l$; $j = 1, 2, \cdots, k$, where $l$ is
an arbitrary positive integer.


1 This is a slightly modified form of the definition in [1]. 







2. A known lemma. The main result will be established by a generalization 
of the following well known lemma: 

Lemma 1 (BERNOULLI). Let $M = \{m\}$ and $M' = \{m'\}$ be two disjoint sets
of measures. If there exists a set $A$ in $\mathfrak{B}$ and a $\delta > 0$ such that

$$|m(A) - m'(A)| > 2\delta$$

for all $m$ in $M$ and all $m'$ in $M'$, then, for any $\epsilon > 0$ given in advance, there exists
an integer $k$ and a set $E$ in $\mathfrak{E}^k$ such that

$$m^k(E) < \epsilon \quad \text{for all } m \text{ in } M$$

and

$$m'^k(E) > 1 - \epsilon \quad \text{for all } m' \text{ in } M'.$$
This is an almost immediate consequence of Bernoulli's theorem, but for the
sake of completeness I include a proof.
PROOF. For any two integers $n$ and $r$, $0 \le r \le n$, let $R(n, r)$ be the union
of all regions in $X^n$ defined by restricting $r$ of the first $n$ coordinates to $A$, and
the remaining $n - r$ to $(X - A)$. For any fixed $n$, let for any measure $\mu$

$$S_\mu^n = \bigcup_{\{r \,\mid\, |r/n - \mu(A)| \ge \delta\}} R(n, r).$$

(Here $\{t \mid T\}$ means the set of all $t$'s which satisfy relation $T$, as customary.)
Let $\epsilon > 0$ be given. By Bernoulli's theorem, there exists an integer $n(\epsilon)$ such
that

$$\mu^n(S_\mu^n) = \sum_{\{r \,\mid\, |r/n - \mu(A)| \ge \delta\}} \binom{n}{r} \mu(A)^r [1 - \mu(A)]^{n-r} < \epsilon.$$

Let $E = \bigcap_{m \in M} S_m^n$. Since for any fixed $n$ there are only a finite number of
different $S_\mu^n$, $E$ is in $\mathfrak{E}^n$ and

$$m^n(E) < \epsilon$$

for all $m$ in $M$. Since for any $m$ in $M$ and $m'$ in $M'$,

$$\left| \frac{r}{n} - m(A) \right| > 2\delta - \left| m'(A) - \frac{r}{n} \right|,$$

we have $|(r/n) - m(A)| > \delta$ for all $r$ satisfying $|m'(A) - (r/n)| < \delta$ for
some $m'$ in $M'$. Hence if $x$ is in $(X^n - S_{m'}^n)$ for some $m'$ in $M'$, then $x$ is in $E$ and

$$m'^n(E) \ge m'^n(X^n - S_{m'}^n) > 1 - \epsilon$$

for all $m'$ in $M'$.
In the special case $M = m$, $M' = m'$, this proves the statement in the
introduction concerning simple hypotheses.
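For two simple hypotheses the construction can be carried out numerically. The sketch below is an illustration with hypothetical values of $m(A)$, $m'(A)$, $\delta$ and $\epsilon$: it takes $E$ to be the set of samples whose relative frequency $r/n$ of falling in $A$ differs from $m(A)$ by at least $\delta$, computes both probabilities exactly from the binomial distribution, and searches for the first $n$ at which both bounds of Lemma 1 hold. The function names and the search limit `n_max` are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb


def prob_far(n, p_true, p_ref, delta):
    """P(|r/n - p_ref| >= delta) when r is Binomial(n, p_true), computed exactly."""
    return sum(comb(n, r) * p_true**r * (1 - p_true)**(n - r)
               for r in range(n + 1)
               if abs(Fraction(r, n) - p_ref) >= delta)


def smallest_n(p, p_prime, delta, eps, n_max=500):
    """First n for which E = {|r/n - p| >= delta} has size < eps and power > 1 - eps."""
    for n in range(1, n_max + 1):
        size = prob_far(n, p, p, delta)           # m^n(E)
        power = prob_far(n, p_prime, p, delta)    # m'^n(E)
        if size < eps and power > 1 - eps:
            return n, size, power
    raise ValueError("no suitable n found up to n_max")


# Hypothetical values: m(A) = 3/10, m'(A) = 6/10, so |m(A) - m'(A)| > 2 * delta.
p, p_prime = Fraction(3, 10), Fraction(6, 10)
delta, eps = Fraction(1, 10), Fraction(5, 100)
n, size, power = smallest_n(p, p_prime, delta, eps)
print(n, float(size), float(power))
```

Exact rational arithmetic is used so that boundary cases such as $|r/n - m(A)| = \delta$ are classified without rounding error.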


3. The main results. 
DEFINITION. Let $M = \{m\}$ and $M' = \{m'\}$ be two disjoint sets of measures.
We shall say that they satisfy "Condition 1" if the following holds: $M$ is the union
of a finite number of its subsets $M_i$, $i = 1, 2, \cdots, k$, such that for every $i$ there
exist
(i) a covering of $M'$ by a finite number of its subsets $M'_{ij}$, $j = 1, 2, \cdots, h_i$,
(ii) a sequence of sets $A_{ij}$ in $\mathfrak{B}$, $j = 1, 2, \cdots, h_i$, and
(iii) a $\delta > 0$ such that $|m(A_{ij}) - m'(A_{ij})| > \delta$ for every $m$ in $M_i$ and every
$m'$ in $M'_{ij}$, $j = 1, 2, \cdots, h_i$; $i = 1, 2, \cdots, k$.

Condition 1 is satisfied for instance if both $M$ and $M'$ contain only a finite
number of measures.

Lemma 2. Let $M = \{m\}$ and $M' = \{m'\}$ be two disjoint sets of measures and
assume that they satisfy Condition 1. Then for every $\epsilon > 0$, there exist an integer
$n(\epsilon)$ and a set $E$ in $\mathfrak{E}^n$ such that $m^n(E) < \epsilon$ for all $m$ in $M$ and $m'^n(E) > 1 - \epsilon$
for all $m'$ in $M'$.

PROOF. Assume first that $k = 1$. Then $M_1 = M$, and we can put $M'_{1j} =
M'_j$, $h_1 = h$, $A_{1j} = A_j$. Condition 1 then states that $|m(A_j) - m'(A_j)| > \delta$
for every $m$ in $M$ and $m'$ in $M'_j$, $j = 1, 2, \cdots, h$. By Lemma 1, for any $\epsilon > 0$
there exists $n_j$ and $E_j$ in $\mathfrak{E}^{n_j}$ such that $m^{n_j}(E_j) < \epsilon/h$ for all $m$ in $M$
and $m'^{n_j}(E_j) > 1 - \epsilon/h$ for all $m'$ in $M'_j$. Let $n = \max n_j$ and

$$E = \bigcup_{j=1}^{h} \left( E_j \times X^{n - n_j} \right).$$

Then $E$ is in $\mathfrak{E}^n$ and

$$m^n(E) \le \sum_{j=1}^{h} m^n(E_j \times X^{n - n_j}) = \sum_{j=1}^{h} m^{n_j}(E_j) < \epsilon$$

for every $m$ in $M$, and if $m'$ is in any fixed $M'_j$,

$$m'^n(E) \ge m'^{n_j}(E_j) > 1 - \frac{\epsilon}{h} \ge 1 - \epsilon,$$

so that

$$m'^n(E) > 1 - \epsilon$$

for all $m'$ in $M'$.
Now if $k > 1$, let us choose some $\theta$, $0 < \theta < \tfrac12$, and apply the above argument
to each $M_i$. We get $m^{n_i}(E_i) < \theta$ for all $m$ in $M_i$ and $m'^{n_i}(E_i) > 1 - \theta$ for
all $m'$ in $M'$. Hence

$$|m^{n_i}(E_i) - m'^{n_i}(E_i)| > 1 - 2\theta > 0,$$

so that Condition 1 is satisfied with $k = 1$ and with the set $\{m'^{\max n_i}\}$ taking
the place of $M$ and the set $\{m^{\max n_i}\}$ taking the place of $M'$. It is easy to see
that also in this case $E$ still belongs to the field $\mathfrak{E}^n$.

If we do not require that the set $E$ in the conclusion of Lemma 2 belong to
$\mathfrak{E}^n$ but only that it belong to $\mathfrak{B}^n$, we can relax Condition 1 in the following
obvious way:

THEOREM 1. In order that two disjoint sets of measures $M = \{m\}$ and $M' =
\{m'\}$ be such that for every $\epsilon > 0$ there be an integer $n$ and a set $B$ in $\mathfrak{B}^n$ for which
$m^n(B) < \epsilon$ for every $m$ in $M$ and $m'^n(B) > 1 - \epsilon$ for every $m'$ in $M'$, it is necessary
and sufficient that for some integer $\nu$ the sets $\{m^\nu\}$ and $\{m'^\nu\}$ satisfy
Condition 1.

THEOREM 2. Let $M = \{m_\theta\}$, $M' = \{m'_\tau\}$ be two disjoint sets of measures, $a \le \theta \le b$, $a' \le \tau \le b'$, where $[a, b]$ and $[a', b']$ are two disjoint, closed intervals of some finite-dimensional Euclidean space, and assume that, for each $B$ in $\mathcal{B}$, $m_\theta(B)$ and $m'_\tau(B)$ are continuous functions of $\theta$ and $\tau$, respectively. Then for any $\epsilon > 0$ given in advance, there exist an integer $n(\epsilon)$ and a set $E$ in $\mathcal{E}^n$ such that $m_\theta^n(E) < \epsilon$ for all $\theta$ in $[a, b]$ and $m'^n_\tau(E) > 1 - \epsilon$ for all $\tau$ in $[a', b']$.

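PROOF. It is sufficient to prove that $M$ and $M'$ satisfy Condition 1. For any $\theta$ in $[a, b]$ and any $\tau$ in $[a', b']$, let $B_{\theta\tau}$ denote a set in $\mathcal{B}$ for which

(1) $|m_\theta(B_{\theta\tau}) - m'_\tau(B_{\theta\tau})| > \epsilon_{\theta\tau} > 0.$

(This is obviously possible.)

Let us now hold $\theta$ fixed. Because of the continuity of $m'_\tau$, for every $\tau$ there exists a $\delta_{\theta\tau} > 0$ such that whenever $|\tau - \tau'| < \delta_{\theta\tau}$, then

(2) $|m'_\tau(B_{\theta\tau}) - m'_{\tau'}(B_{\theta\tau})| < \dfrac{\epsilon_{\theta\tau}}{3}.$

Since $[a', b']$ is compact, it can be covered for each fixed $\theta$ by a finite subset of the open intervals $I_{\theta\tau} = (-\delta_{\theta\tau} + \tau, \tau + \delta_{\theta\tau})$, say $I_{\theta 1}, I_{\theta 2}, \ldots, I_{\theta h(\theta)}$, with midpoints $\tau_{\theta 1}, \tau_{\theta 2}, \ldots, \tau_{\theta h(\theta)}$. Denote the values of $m'_\tau$, $B_{\theta\tau}$, $\epsilon_{\theta\tau}$, $\delta_{\theta\tau}$ at $\tau = \tau_{\theta j}$ by $m'_{\theta j}$, $B_{\theta j}$, $\epsilon_{\theta j}$, $\delta_{\theta j}$, respectively, for $j = 1, 2, \ldots, h(\theta)$.

Since $m_\theta(B)$ is continuous in $\theta$ for all $B$, there exists a positive number $\rho_\theta$ such that whenever $|\theta - \theta'| < \rho_\theta$ then simultaneously for $j = 1, 2, \ldots, h(\theta)$

(3) $|m_\theta(B_{\theta j}) - m_{\theta'}(B_{\theta j})| < \tfrac{1}{3} \min_j \epsilon_{\theta j},$

and since $[a, b]$ is compact, it can be covered by a finite subset of the open intervals $L_\theta = (-\rho_\theta + \theta, \theta + \rho_\theta)$, say $L_i = (-\rho_i + \theta_i, \theta_i + \rho_i)$, $i = 1, 2, \ldots, k$. Let us denote the values of $\tau_{\theta j}$, $B_{\theta j}$, $h(\theta)$, $\epsilon_{\theta j}$, $\delta_{\theta j}$, $I_{\theta j}$ at $\theta = \theta_i$ by $\tau_{ij}$, $B_{ij}$, $h_i$, $\epsilon_{ij}$, $\delta_{ij}$, $I_{ij}$, respectively. Then the sets $M_i = \{m_\theta \mid \theta \text{ in } L_i\}$, $i = 1, 2, \ldots, k$, cover $M$, and for each $i$ the sets $M'_{ij} = \{m'_\tau \mid \tau \text{ in } I_{ij}\}$, $j = 1, 2, \ldots, h_i$, cover $M'$. Furthermore it follows from (1), (2), and (3) that as long as $m_\theta$ is in $M_i$ and $m'_\tau$ in $M'_{ij}$,

$$|m_\theta(B_{ij}) - m'_\tau(B_{ij})| \ge |m_{\theta_i}(B_{ij}) - m'_{\tau_{ij}}(B_{ij})| - |m_{\theta_i}(B_{ij}) - m_\theta(B_{ij})| - |m'_\tau(B_{ij}) - m'_{\tau_{ij}}(B_{ij})|$$
$$> \epsilon_{ij} - \tfrac{1}{3} \min_j \epsilon_{ij} - \tfrac{1}{3} \epsilon_{ij} \ge \tfrac{1}{3} \min_{i,j} \epsilon_{ij} > 0, \qquad j = 1, 2, \ldots, h_i;\ i = 1, 2, \ldots, k,$$

as we wanted to prove.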
In the statistical terminology of the introduction, Theorem 2 may be restated as follows: Let $H_0$ be the hypothesis that the unknown distribution of some random variable belongs to a set of distributions $M$, and $H_1$ that it belongs to another, $M'$, where $M$ and $M'$ satisfy the assumptions of Theorem 2. Then there exists a uniformly consistent sequence of tests for testing $H_0$ against $H_1$.

CONSISTENT TESTS 293


4. An example. Let $F(t) = (1/\sqrt{2\pi})\, e^{-(x-t)^2/2}$, $H_0 = F(0)$, $H_1 = \{F(t), 1 \le t \le 2\}$. Let

$$R_{n,i} = \left\{(x_1, \ldots, x_n) : \frac{1}{\sqrt{n}} \sum_{j=1}^{n} x_j \ge c_i\right\}, \qquad i = 1, 2, \ldots;\ n = 1, 2, \ldots,$$

where $c_i$ is determined so that $P(R_{n,i} \mid 0) = 1/i$, $P(S \mid t)$ denoting the probability of the region $S$ when $t$ is the true mean. Thus $R_{n,i}$ is the uniformly most powerful region of size $1/i$ in $n$-dimensional sample space for testing $H_0$ against $H_1$. The regions $R_{n,i}$ define a uniformly consistent test. A proof avoiding all computation is based on Theorem 2 as follows. Let $\epsilon > 0$ be given; find $i$ such that $1/i < \epsilon$. By Theorem 2, there exists an $N$ and a Borel set $B$ in the $N$-dimensional sample space such that $P(B \mid 0) < 1/i$ and $P(B \mid t) > 1 - 1/i$ for $1 \le t \le 2$. Let $W$ be a region in $N$-dimensional sample space covering $B$ and such that $P(W \mid 0) = 1/i$. Then $P(W \mid 0) = P(R_{N,i} \mid 0) = 1/i < \epsilon$, and by the definition of $R_{N,i}$

$$P(R_{N,i} \mid t) \ge P(W \mid t) \ge P(B \mid t) > 1 - 1/i > 1 - \epsilon, \qquad 1 \le t \le 2.$$
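The example can also be checked by direct computation, which the proof above deliberately avoids. The following is a minimal sketch, assuming the most powerful region has the form "sum of the observations divided by the square root of n is at least c_i" for unit-variance normal data; the function names `critical_value`, `power` and `smallest_n` are illustrative and not part of the original note.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, variance 1


def critical_value(i: int) -> float:
    """c_i with P(Z >= c_i) = 1/i for a standard normal Z (needs i >= 2)."""
    return STD_NORMAL.inv_cdf(1.0 - 1.0 / i)


def power(n: int, i: int, t: float) -> float:
    """P(R_{n,i} | t) for the assumed region {sum(x)/sqrt(n) >= c_i}."""
    # Under true mean t, sum(x)/sqrt(n) is normal with mean sqrt(n)*t, variance 1.
    return 1.0 - STD_NORMAL.cdf(critical_value(i) - sqrt(n) * t)


def smallest_n(i: int) -> int:
    """Smallest n whose power exceeds 1 - 1/i for every t in [1, 2]."""
    n = 1
    # The power increases with t, so t = 1 is the least favourable alternative.
    while power(n, i, 1.0) <= 1.0 - 1.0 / i:
        n += 1
    return n
```

For a given `i`, `smallest_n(i)` plays the role of the integer N whose existence Theorem 2 guarantees, and `power(n, i, 0.0)` returns the size 1/i of the region.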


It is a pleasure to express my best thanks to Professor J. Wolfowitz for calling 
my attention to the problem and to Professors J. von Neumann and H. Scheffé 
for their valuable suggestions. 


REFERENCES
[1] A. Wald and J. Wolfowitz, "On a test whether two samples are from the same population," Annals of Math. Stat., Vol. 11 (1940), pp. 147-162.
[2] A. Wald, "Asymptotically most powerful tests of statistical hypotheses," Annals of Math. Stat., Vol. 12 (1941), pp. 1-19.
[3] A. Wald, "Some examples of asymptotically most powerful tests," Annals of Math. Stat., Vol. 12 (1941), pp. 396-408.
[4] A. Wald, "Tests of statistical hypotheses concerning several parameters when the number of observations is large," Trans. Am. Math. Soc., Vol. 54 (1943), pp. 426-482.
[5] R. von Mises, "On the problem of testing hypotheses," Annals of Math. Stat., Vol. 14 (1943), pp. 238-252.





NONPARAMETRIC ESTIMATION IV 


By D. A. S. FRASER AND R. WORMLEIGHTON¹
University of Toronto 


1. Summary. In the three papers, [1], [2], [3], entitled ‘Nonparametric estima- 
tion’, Scheffé and Tukey generalized previous results on tolerance regions and 
extended them to cover all continuous and discontinuous distribution functions. 
This note contains four comments arising from these papers: first, on a method 
for giving bounds to the confidence level in the discontinuous case which can 
lower the probability that the end points need to have part, a random variable, 
of their probability neglected to maintain the given confidence level; second, 
on a correction of a statement of results in [2]; third, on a proof in [2] requiring 
a further statement; fourth, a necessary restatement of theorems in [3].

2. Bounds for the confidence level. In paper [1] Scheffé and Tukey extend the theory of tolerance regions to the one dimensional discontinuous case, and obtain the following statement:

(2.1) $\Pr\{C_{(p)-0,(q)+0} \ge \beta\} \ge 1 - \alpha_{q-p} \ge \Pr\{C_{(p)+0,(q)-0} \ge \beta\},$

where $C_{(p)+0,(q)-0}$ and $C_{(p)-0,(q)+0}$ are respectively the coverages of the open and closed intervals with end points the $p$th and $q$th order statistics $z_p$ and $z_q$ ($q > p$) of a sample of $n$ from a distribution and $\alpha_{q-p}$ is the incomplete Beta function $I_\beta(q - p, n - q + p + 1)$.

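This statement implies the following statements:

$$1 - \alpha_{q-p+2} \ge \Pr\{C_{(p-1)+0,(q+1)-0} \ge \beta\} \ge \Pr\{C_{(p)-0,(q)+0} \ge \beta\};$$

$$\Pr\{C_{(p)+0,(q)-0} \ge \beta\} \ge \Pr\{C_{(p+1)-0,(q-1)+0} \ge \beta\} \ge 1 - \alpha_{q-p-2}.$$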

This suggests giving bounds for the confidence levels of the tolerance regions 
of statement (2.1). 
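The bounds in question are easy to evaluate numerically. The sketch below assumes that $\alpha_r$ stands for $I_\beta(r, n - r + 1)$, the incomplete Beta function with integer arguments, and uses the standard identity $I_\beta(r, n - r + 1) = \Pr\{\mathrm{Bin}(n, \beta) \ge r\}$; the function names are illustrative only.

```python
from math import comb


def confidence_level(r: int, n: int, beta: float) -> float:
    """1 - I_beta(r, n - r + 1): chance that a coverage made of r blocks is >= beta."""
    # For integer arguments, I_beta(r, n - r + 1) = P(Bin(n, beta) >= r),
    # so its complement is the binomial probability of at most r - 1 successes.
    return sum(comb(n, j) * beta**j * (1.0 - beta) ** (n - j) for j in range(r))


def closed_interval_bounds(p: int, q: int, n: int, beta: float) -> tuple[float, float]:
    """Lower and upper bounds 1 - alpha_{q-p} and 1 - alpha_{q-p+2}."""
    return confidence_level(q - p, n, beta), confidence_level(q - p + 2, n, beta)
```

A call such as `closed_interval_bounds(3, 98, 100, 0.9)` returns the pair of levels between which the confidence level of the closed interval from the 3rd to the 98th order statistic must lie.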

Let us consider the one dimensional representation theorem with its "inverse probability integral transformation". This transformation, labelled $g_F(x^*)$, mapped $x^*$ with a uniform distribution into $x$ with the given distribution represented by $F(x)$; $z^*$ and $z$ refer to the corresponding order statistics. Take any interval on the range of the uniform distribution whose end points lie respectively in the closed intervals $[z^*_{p-1}, z^*_p]$ and $[z^*_q, z^*_{q+1}]$. The confidence level, that the coverage of this interval is at least $\beta$, lies between $1 - \alpha_{q-p}$ and $1 - \alpha_{q-p+2}$. Apply the mapping $g_F(x^*)$. The confidence level lies between $1 - \alpha_{q-p}$ and $1 - \alpha_{q-p+2}$ that the following coverage is greater than or equal to $\beta$: $C_{(p)-0,(q)+0}$, if $z_p$ is distinct from $z_{p-1}$ and $z_q$ is distinct from $z_{q+1}$; $C_{(p)-0,(q)-0}$ + "fraction" of the coverage of $z_q$, if $z_p$ is distinct from $z_{p-1}$ and $z_q$ identical to $z_{q+1}$; and similarly for the other two possible cases. The "fraction" (a number between 0 and 1) can be considered as a random variable as determined by the above mapping or as a fixed value, since the relation must be true for at least one fixed value for any given distribution and integers $p$ and $q$. In either case it is unknown to the practitioner and the interpretation would be unimportant.

¹ The authors wish to thank Professor John Tukey for suggesting Definitions 5.2 and 5.3.

Similarly we obtain the following result: The confidence level lies between $1 - \alpha_{q-p}$ and $1 - \alpha_{q-p+r+s}$, that the following coverage is greater than or equal to $\beta$: $C_{(p)-0,(q)+0}$, if $z_p$ is distinct from $z_{p-r}$ and $z_q$ is distinct from $z_{q+s}$; $C_{(p)-0,(q)-0}$ + "fraction" of the coverage of $z_q$, if $z_p$ is distinct from $z_{p-r}$ and $z_q$ is identical to $z_{q+s}$; and similarly for the other two cases.

The open interval can be treated in a similar manner. 

The application of these results would be for the practitioner who was familiar with the type of data he was to receive and realized that perhaps two or three order statistics would be tied on one or perhaps both tails. He would then choose $r$ and $s$ to give as tight a control of the confidence level as is consistent with a reasonable determinacy in the tolerance interval (the probability being small that the coverage should be considered as including only part instead of all of the coverage of the end points).

These results also generalize to the multivariate case with little alteration. For example consider the following result which would correspond to the closed interval case above. The confidence level lies between $1 - \alpha_{q-p}$ and $1 - \alpha_{q-p+r}$ that the following coverage is greater than or equal to $\beta$: $\operatorname{cov}\{\bar{B}_\lambda\}$, if $\bar{B}_\lambda$ is contained in $B_{\lambda+\mu}$, where $\mu$ consists of $r$ of the integers $(1, 2, \ldots, n + 1)$ which are not contained in $\lambda$; $\operatorname{cov}\{B_\lambda\}$ + "fraction" of $(\operatorname{cov}\{\bar{B}_\lambda\} - \operatorname{cov}\{B_\lambda\})$, if $\bar{B}_\lambda$ is not contained in $B_{\lambda+\mu}$. Here the "fraction" can be considered a random variable or fixed, in either case unknown to the practitioner ($\lambda$ containing $q - p$ integers).

In formulating the above generalization, attention was drawn to the fact that the block groups did not form a proper sequence as $\lambda$ was increased. By the following counterexample the theorems in [3] are seen to be incorrect using the given definition of block groups. Rectifying definitions are presented in Section 5.

Following the notation of [3], let


Gilr, Y) 
G2(X, y) 
o3(r, y) 


CailT, Y) 


and p; = i(i = 1, 2, 3, 4). 
Consider the distribution $F(x, y) = e(x)e(y)$ where $e(x)$ is defined by

$$e(x) = 0, \quad x < 0,$$
$$e(x) = 1, \quad x \ge 0.$$

Take a sample of $n \ge 6$ from this distribution; the sample values will all be $(0, 0)$ with probability one.





$$S_1 = \{(x, y) \mid y > 0 \text{ or } y = 0,\ x > 0\},$$
$$T_1 = \{(0, 0)\},$$
$$S_2 = \{(x, y) \mid y < 0,\ x \ge 0\},$$
$$T_2 = \text{Null set},$$
$$S_3 = \{(x, y) \mid x < 0,\ y \le 0\},$$
$$T_3 = \text{Null set},$$
$$S_4 = \text{Null set}.$$

The corresponding coverages are respectively 0, 1, 0, 0, 0, 0, 0.
Taking $\lambda = \{3\}$ we find by the definition of block groups that

$$B_\lambda = T_2 \cup S_3 \cup T_3, \qquad C_\lambda = 0$$

with probability 1. Thus $\Pr\{C_\lambda < \alpha\} = 1 \ne I_\alpha(1, n)$.

Taking $\lambda = \{1, 2\}$ we have $B_\lambda = S_1 \cup T_1 \cup S_2$, and $C_\lambda = 1$ with probability 1. Thus $\Pr\{C_\lambda < \alpha\} = 0 \ne I_\alpha(2, n - 1)$.

The proof in [3] is in error on page 39, the seventh line from the bottom.

3. Correction of a statement of results. In paper [2] on page 536 a “Statement 
of Results for Measure Theorists” is given. The theorem B,,; should read: 
Hold the $n$ functions $\varphi_1, \varphi_2, \ldots, \varphi_n$ and the probability measure fixed, then $T^n$ is mapped on $B_n$, and the power measure $\mu^n$ is carried by that mapping into a measure of $B_n$. This measure is always $n!/\sqrt{n + 1}$ times Lebesgue measure.

4. A proof; a further statement. In the proof on page 537 of paper [2], the problem is to show that the distribution of $n - m$ variates is the same when obtained by two methods of calculation; more particularly, to show that, given that in a sample of $n$, one value falls in each of the sets $A_1, A_2, \ldots, A_m$ and the remaining $n - m$ fall in $B$, then the distribution of the $n - m$ in $B$ is that of a sample of $n - m$ restricted to $B$. The statement is made that the probability of the above, and in addition that the $n - m$ falling in $B$ fall in $R \subset B$, is

$$\frac{n!}{(n - m)!}\, \mu(A_1)\mu(A_2) \cdots \mu(A_m)\, \mu^{n-m}(B)$$

times the probability that a sample of $n - m$ restricted to $B$ falls in $R$. To show that the distributions are identical, a further statement is needed: that for one variate in each $A_i$, for $p$ falling in $R$, and $n - m - p$ in $B - R$, the probability is equal to

$$\frac{n!}{(n - m)!}\, \mu(A_1) \cdots \mu(A_m)\, \mu^{n-m}(B)$$

times the probability that a sample of $n - m$ restricted to $B$ divides $p$ into $R$ and $n - m - p$ into $B - R$.
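The further statement is a factorization of a multinomial probability, and it can be verified exactly with rational arithmetic. The sketch below is only an illustration under the assumption that the sets $A_1, \ldots, A_m$, $R$ and $B - R$ are disjoint; the function names and the numerical measures used in any call are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, prod


def joint_probability(mu_A, mu_R, mu_B, n, p):
    """P(one value in each A_i, p values in R, the other n - m - p in B - R)."""
    rest = n - len(mu_A)
    # Multinomial coefficient n! / (1! ... 1! p! (rest - p)!).
    coeff = Fraction(factorial(n), factorial(p) * factorial(rest - p))
    return coeff * prod(mu_A) * mu_R**p * (mu_B - mu_R) ** (rest - p)


def factored_form(mu_A, mu_R, mu_B, n, p):
    """Leading factor of the text times the restricted-sample probability."""
    rest = n - len(mu_A)
    leading = Fraction(factorial(n), factorial(rest)) * prod(mu_A) * mu_B**rest
    # A sample of size rest restricted to B puts p points in R with a
    # binomial probability whose success chance is mu(R) / mu(B).
    share = mu_R / mu_B
    restricted = comb(rest, p) * share**p * (1 - share) ** (rest - p)
    return leading * restricted
```

With `Fraction` inputs the two functions agree exactly for every admissible `p`, which is the identity needed to conclude that the two distributions coincide.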

5. Restatement of theorems in [3]. As has been noted in Section 2 above, 
Theorems A*),,;: and B*,, fail when actual ties (coincident points) occur. 
The following redefinition of the block groups overcomes this difficulty and the 
proof follows as given in [3]. 







Define $S_i$ and $T_j$ as in (4.1) in [3].

DEFINITION 5.1. Let $\bar{S}_i$ be given by the definition for $S_i$ where $<$ is replaced by $\le$ and $>$ is replaced by $\ge$.

DEFINITION 5.2. The block group $B_\lambda$ consists of the union of all $S_i$ with $i$ in $\lambda$ and all $T_j$ not contained in any $\bar{S}_i$ with $i$ not in $\lambda$.

DEFINITION 5.3. The closed block group $\bar{B}_\lambda$ consists of the union of all $S_i$ with $i$ in $\lambda$ and all $T_j$ contained in any $\bar{S}_i$ with $i$ in $\lambda$.

Using the above definitions, Theorems A>),,: and B%,, follow provided the 
"m-system of functions" is chosen so that all $T_j$ are reduced to points.

A more general definition of block groups which will cover cases where the "m-system of functions" does not reduce all cuts to points and which is identical to that of (5.2) and (5.3) when all cuts are necessarily points is given by (5.4) and (5.5).

DEFINITION 5.4. The closed block group $\bar{B}_\lambda$ consists of the union of all $\bar{S}_i$ with $i$ in $\lambda$.

DEFINITION 5.5. The block group $B_\lambda$ consists of the complement of $\bar{B}_{C(\lambda)}$ where $C(\lambda)$ is the complement of $\lambda$ with respect to the integers $(1, 2, \ldots, n + 1)$.

According to the representation theorem in [3], we have a continuous joint distribution of variates $U_1, U_2, \ldots, U_m$. By means of monotone functions $g_1(U_1), \ldots, g_m(U_m)$ this continuous distribution is mapped into a discontinuous distribution identical to the distribution of $\psi_1(w_1), \ldots, \psi_m(w_m)$.


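Let
$$S_1 = \{(U_1, \ldots, U_m) \mid U_1 > u_1(t_{(1)})\},$$
$$S_2 = \{(U_1, \ldots, U_m) \mid U_1 < u_1(t_{(1)}),\ U_2 > u_2(t_{(2)})\},$$
$$\cdots$$
$$S_m = \{(U_1, \ldots, U_m) \mid U_1 < u_1(t_{(1)}), \ldots, U_m > u_m(t_{(m)})\},$$
$$S_{m+1} = \{(U_1, \ldots, U_m) \mid U_1 < u_1(t_{(1)}), \ldots, U_m < u_m(t_{(m)})\}.$$

Also we have:
$$S^*_1 = \{g_1(U_1), \ldots, g_m(U_m) \mid g_1(U_1) > g_1(u_1(t_{(1)}))\},$$
$$S^*_2 = \{g_1(U_1), \ldots, g_m(U_m) \mid g_1(U_1) < g_1(u_1(t_{(1)})),\ g_2(U_2) > g_2(u_2(t_{(2)}))\},$$
$$\cdots$$
$$S^*_{m+1} = \{g_1(U_1), \ldots, g_m(U_m) \mid g_1(U_1) < g_1(u_1(t_{(1)})), \ldots, g_m(U_m) < g_m(u_m(t_{(m)}))\},$$
and $\bar{S}^*_1$, $\bar{S}^*_2$, etc., are defined as $S^*_1$, $S^*_2$, etc., where $<$ is replaced by $\le$ and $>$ is replaced by $\ge$.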

Consider now the inverse mapping of the sets $S^*_1, S^*_2, \ldots$ and $\bar{S}^*_1, \bar{S}^*_2, \ldots$ into the space of $(U_1, U_2, \ldots, U_m)$. We shall have

$$g^{-1}(S^*_i) \subset S_i \subset g^{-1}(\bar{S}^*_i)$$

because

$$g_i(U_i) > g_i(a) \Rightarrow U_i > a \Rightarrow g_i(U_i) \ge g_i(a).$$

Thus we have the following inequality for the corresponding coverages:

$$\operatorname{cov}(S^*_i) \le \operatorname{cov}(S_i) \le \operatorname{cov}(\bar{S}^*_i).$$

The proof follows directly from this relation as in section (9) of [3].


REFERENCES
[1] H. Scheffé and J. W. Tukey, "Nonparametric estimation I," Annals of Math. Stat., Vol. 16 (1945), pp. 187-192.
[2] J. W. Tukey, "Nonparametric estimation II," Annals of Math. Stat., Vol. 18 (1947), pp. 529-539.
[3] J. W. Tukey, "Nonparametric estimation III," Annals of Math. Stat., Vol. 19 (1948), pp. 30-39.





CONDITIONAL EXPECTATION AND THE EFFICIENCY OF ESTIMATES 
By PAUL G. HOEL


University of California, Los Angeles 


1. Summary. A probability density function, $f(x; \theta)$, is considered for which there exists a sufficient statistic. It is shown, under certain regularity conditions on the family of distributions and on the class of estimates, that if there exists an unbiased sufficient estimate of $\theta$, it will be unique. This result is used to show that when the regularity conditions are satisfied, the method of Blackwell for improving an unbiased estimate of $\theta$ merely yields a natural estimate.


2. Distribution of a sufficient statistic. For the purpose of proving the uniqueness of an unbiased sufficient estimate of $\theta$, it is helpful to find the form of the probability density function of a particular sufficient statistic from a knowledge of the form of $f(x; \theta)$. It has been shown [1], [2] under different sets of assumptions that when a sufficient statistic exists, $f(x; \theta)$ must possess the functional form

(1) $f(x; \theta) = \exp[g(\theta) + h(\theta) r(x) + s(x)],$

provided the range of $x$ does not depend on $\theta$.

ASSUMPTION 1. It will be assumed that $f(x; \theta)$ has the form given by (1).

Koopman [1] proves (1) under the assumption that $f(x; \theta)$ is analytic. Pitman [2] assumes only that $\partial^2 f(x; \theta)/\partial x\, \partial\theta$ exists but adds a differentiability condition on the density function of the sufficient statistic that is assumed to exist.

Now consider the distribution of a particular statistic. From (1), the probability density function for a random sample is

(2) $f(x_1, \ldots, x_n; \theta) = \exp\left[n g(\theta) + h(\theta) \sum_{i=1}^{n} r(x_i) + \sum_{i=1}^{n} s(x_i)\right].$


The particular statistic to be considered here is $z = \sum r(x_i)$. From a lemma of Lehmann and Scheffé [3], mild regularity conditions will insure the existence of a transformation to new variables $z, t_2, \ldots, t_n$ such that the density function of the new variables may be expressed in the form

(3) $F(z, t_2, \ldots, t_n; \theta) = f(x_1, \ldots, x_n; \theta)/|J|,$

where $J$ is the Jacobian of the transformation and where the $x$'s are replaced by their expressions in terms of the new variables. The essential regularity conditions here are that $r'(x) \ne 0$, except possibly on a set of measure zero, and that $r'(x)$ is continuous.

ASSUMPTION 2. It will be assumed that the lemma conditions are satisfied.

If the assumptions made in [1] had been employed, these restrictions on r(x) 
would have been satisfied because then r(x) would be analytic. 

If (2) is substituted in (3), and then (3) is integrated over the range of the variables $t_2, \ldots, t_n$, the density function of $z$ will be obtained in the form

(4) $p(z; \theta) = L(z) \exp[n g(\theta) + h(\theta) z],$

because $\exp[\sum s(x_i)]/|J|$ does not involve $\theta$ and thus its integral over the range of the $t$'s will be a function of $z$ only.

It is easily seen that $z$ is a sufficient statistic because it suffices to show that $f(x_1, \ldots, x_n; \theta)/p(z; \theta)$ is independent of $\theta$. From (2) and (4) it is clear that this ratio is independent of $\theta$.
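As a concrete instance, not taken from the original note, consider the exponential density $f(x; \theta) = \theta e^{-\theta x}$ for $x > 0$, which has the form (1) with $g(\theta) = \log\theta$, $h(\theta) = -\theta$, $r(x) = x$ and $s(x) = 0$. Then $z = \sum x_i$ has a gamma density, so that $L(z) = z^{n-1}/(n-1)!$ in (4). The sketch below checks numerically that the ratio of (2) to (4) does not change with $\theta$; all function names are illustrative.

```python
from math import exp, factorial, isclose, log


def joint_density(xs, theta):
    """Formula (2) for the exponential density: exp[n log(theta) - theta * sum(x)]."""
    return exp(len(xs) * log(theta) - theta * sum(xs))


def density_of_z(z, n, theta):
    """Formula (4) with L(z) = z^(n-1) / (n-1)!, the gamma density of z."""
    return z ** (n - 1) / factorial(n - 1) * exp(n * log(theta) - theta * z)


def sufficiency_ratio(xs, theta):
    """f(x_1, ..., x_n; theta) / p(z; theta), which should not depend on theta."""
    return joint_density(xs, theta) / density_of_z(sum(xs), len(xs), theta)
```

For any fixed sample the ratio equals $(n-1)!/z^{n-1}$ whatever value of $\theta$ is supplied, which is what sufficiency of $z$ requires.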


3. Relationship of sufficient estimates. If an estimate of $\theta$ is understood to be a single-valued function of the random variables $x_1, \ldots, x_n$, and if certain derivatives exist, then any sufficient estimate will be a function of $z$. For, let $y$ be any sufficient estimate. Then

(5) $f(x_1, \ldots, x_n; \theta) = G(y; \theta) H(x_1, \ldots, x_n),$

where $G(y; \theta)$ is the density function of $y$. If $\partial \log f/\partial\theta$ is calculated for both (2) and (5) and if the results are equated, it will follow that

$$\frac{\partial \log G(y; \theta)}{\partial\theta} = n g'(\theta) + h'(\theta) z.$$

This result shows that $z$ is a single-valued function of $y$, when the derivatives exist. Conversely, since only single-valued functions of the variables $x_1, \ldots, x_n$ are considered as estimates, $y$ will be a single-valued function of $z$. If the relationship is $z = T(y)$ and $y = T^{-1}(z)$ is multiple-valued, $y$ will be defined only on one branch.

ASSUMPTION 3. It will be assumed that $g'(\theta)$, $h'(\theta)$, and $\partial G(y; \theta)/\partial\theta$ exist for some value of $\theta$.

The restriction that $g'(\theta)$ and $h'(\theta)$ should exist would be satisfied if the assumptions made in either [1] or [2] to arrive at formula (1) had been made instead of Assumption 1. The restriction that $\partial G(y; \theta)/\partial\theta$ should exist is a weak restriction on the class of sufficient estimates, $y$, being considered.


4. Uniqueness. Suppose there exists an unbiased sufficient estimate of $\theta$. From the preceding section it will be a function of $z$, say $w(z)$. Then, from (4),

(6) $\int w(z) L(z) e^{n g(\theta) + h(\theta) z}\, dz = \theta.$

Now suppose that there were two such estimates, say $w_1(z)$ and $w_2(z)$. Then, letting $a = h(\theta)$, (6) would yield the following relationship:

(7) $\int w_1(z) L(z) e^{az}\, dz = \int w_2(z) L(z) e^{az}\, dz.$

But from the theory of Laplace transforms ([4], p. 244), it follows that (7) implies that $w_1(z) = w_2(z)$ except on sets of measure zero, provided that $w_1(z)L(z)$ and $w_2(z)L(z)$ are integrable in every finite interval and provided that (7) holds for some interval of values of $a$. It is easily seen that the existence of (6) for all admissible values of $\theta$ insures the integrability condition. If $f(x; \theta)$ is defined for an interval of values of $\theta$, (6) will hold for such an interval. Since, from Assumption 3, $a = h(\theta)$ is continuous and from (1) is obviously not constant, $a$ must exist for an interval of values also, and hence (7) will hold for such an interval.

ASSUMPTION 4. It will be assumed that $f(x; \theta)$ is defined for some interval of values of $\theta$.

The discussions of the preceding sections may be summarized in the following theorem.

THEOREM. If Assumptions 1, 2, 3, and 4 are satisfied and if there exists an unbiased sufficient estimate of $\theta$, it will be the unique such estimate.


5. Efficiency. Let $t$ be any unbiased estimate of $\theta$ and let $u$ be any sufficient estimate of $\theta$. Then Blackwell [5] has shown that

(8) $v = E[t \mid u],$

which is the conditional expected value of $t$ for $u$ fixed, determines an unbiased estimate of $\theta$ whose variance cannot exceed that of $t$. If $t$ is not a function of $u$, the variance of $v$ will be less than that of $t$.

This device for improving the efficiency of an unbiased estimate appears promising. However, since $v$ is a function of $u$ and thus is a sufficient unbiased estimate of $\theta$, and since it has been shown that, subject to mild regularity conditions, any unbiased sufficient estimate is unique, this device will merely yield the unique unbiased sufficient estimate. Since the statistic $z$ may be found by inspecting $f(x; \theta)$, it suffices to find the function $w(z)$ which is unbiased in order to obtain the desired estimate, when such an estimate exists. From (8), the existence of an unbiased estimate insures the existence of an unbiased sufficient estimate. Since, from (2), the maximum likelihood estimate of $\theta$ is a function of $z$, a natural method for finding this unique estimate when it exists is to first find the maximum likelihood estimate and then, if necessary, determine what function of it will be unbiased. Formula (8) with $u$ chosen as $z$ could also be used to find this unique estimate.
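Formula (8) can be illustrated by exact enumeration in a discrete analogue, which is an assumption of this sketch since the note itself treats densities. Take $n$ independent Poisson variates with mean $\theta$, the crude unbiased estimate $t = x_1$, and $u = z = \sum x_i$. The function names below are illustrative only.

```python
from fractions import Fraction
from math import factorial


def poisson_weight(x, mean):
    """Poisson probability of x with the factor exp(-mean) left out."""
    return mean**x / factorial(x)


def conditional_mean_of_first(z, n, theta):
    """E[x_1 | x_1 + ... + x_n = z] for n independent Poisson(theta) variates."""
    # x_2 + ... + x_n is Poisson with mean (n - 1) * theta. The exponential
    # factors omitted from the weights are the same for every term and cancel.
    weights = [
        poisson_weight(a, theta) * poisson_weight(z - a, (n - 1) * theta)
        for a in range(z + 1)
    ]
    return sum(a * w for a, w in enumerate(weights)) / sum(weights)
```

Evaluated with a rational `theta`, the result is $z/n$ for every value of $\theta$, so the improved estimate is the sample mean, which here is also the maximum likelihood estimate.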


REFERENCES

[1] B. O. Koopman, "On distributions admitting a sufficient statistic," Trans. Am. Math. Soc., Vol. 39 (1936), p. 399.
[2] E. J. G. Pitman, "Sufficient statistics and intrinsic accuracy," Proc. Cambridge Philos. Soc., Vol. 32 (1936), p. 567.
[3] E. L. Lehmann and H. Scheffé, "On the problem of similar regions," Proc. Nat. Acad. Sci., Vol. 33 (1947), p. 383.
[4] D. V. Widder, The Laplace Transform, Princeton University Press, 1946.
[5] D. Blackwell, "Conditional expectation and unbiased sequential estimation," Annals of Math. Stat., Vol. 18 (1947), p. 105.







LINEAR TRANSFORMATIONS AND THE PRODUCT-MOMENT MATRIX 


By K. NAGABHUSHANAM 


Institute of Mathematical Statistics, Stockholm 



Using linear transformations G. Rasch has deduced Wishart's distribution in his paper on "A Functional Equation for Wishart's Distribution" (Annals of Math. Stat., Vol. 19 (1948), pp. 262-266). This note is of the nature of some observations on the Jacobian of the transformation induced by a linear transformation of coordinates with constant coefficients in the distinct elements of the product-moment matrix of a sample of $n$ vectors, each of $k$ components, drawn from a universe of a normal $k$-variate distribution with zero means. If the $r$-th vector of the drawn sample has components $(x_i^{(r)})$, $r = 1, 2, \ldots, n$; $i = 1, 2, \ldots, k$, the sum of the products of the $i$-th and $j$-th components of each of the $n$ vectors is denoted by

$$M_{ij} = \sum_{r=1}^{n} x_i^{(r)} x_j^{(r)}.$$

Let the variables of the vector variate be $x_1, x_2, \ldots, x_k$, or shortly $(x)$, and let a nonsingular (i.e., reversible) linear transformation of the variables with constant coefficients be made from $(x)$ to $(y)$, viz.,

$$x_r = \sum_{i=1}^{k} a_{ri}\, y_i \qquad (r = 1, 2, \ldots, k).$$

The distinct elements $M_{11}, M_{12}, \ldots, M_{1k}, M_{22}, M_{23}, \ldots, M_{kk}$ undergo a consequential or induced transformation which is also linear in terms of the corresponding elements of the product-moment matrix $\| \bar{M}_{ij} \|$ of the same $n$-vector sample in the coordinates $(y)$.

Let the matrix of the coefficients of the induced transformation which is also the matrix of partial derivatives in this case be denoted by $\| J \|$, and let its determinant which is the Jacobian of the transformation be denoted by $J$. The elements of $\| J \|$ are functions of the elements of $\| a_{ri} \|$. When $\| a_{ri} \|$ is in the diagonal form, so is also $\| J \|$ with elements $a_{11}a_{11}, a_{11}a_{22}, \ldots, a_{11}a_{kk}, a_{22}a_{22}, a_{22}a_{33}, \ldots, a_{kk}a_{kk}$. In this case we have $J = A^{k+1}$, where $A$ stands for the determinant of $\| a_{ri} \|$ which is nonzero on account of the nonsingularity of the transformation considered. It is then natural to ask the question whether it can be asserted that the same relationship holds even when the matrix $\| a_{ri} \|$ is not in the diagonal form or reducible to it. The answer is in the affirmative, the result being a particular case of the following theorem of Escherich (see C. C. MacDuffee, Theory of Matrices, Ergebnisse der Mathematik und ihrer Grenzgebiete, Julius Springer, Berlin, 1933, p. 86, theorem 44.3): The determinant of the $m$-th power-matrix¹ with a nonvanishing determinant $A$ is $A^{\binom{k+m-1}{m-1}}$, where $k$ is the order of the square matrix whose determinant is $A$. Here $\| J \|$ is the second power-matrix, and so we have in all cases $J = A^{k+1}$.

¹ It may be noted that the $m$-th power-matrix mentioned in Escherich's theorem is not the matrix multiplied by itself by the ordinary matrix multiplication. For its definition see MacDuffee, loc. cit., p. 85.
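The relation between $J$ and $A$ can be checked on a particular non-diagonal matrix by building the induced matrix explicitly. The sketch below assumes the induced coefficients are those obtained by expanding $M_{ij} = \sum_l \sum_m a_{il} a_{jm} \bar{M}_{lm}$ and collecting the distinct elements with $l \le m$; the helper names are illustrative and not from the note.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    size = len(rows)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row swap changes the sign
        det *= rows[col][col]
        for r in range(col + 1, size):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, size):
                rows[r][c] -= factor * rows[col][c]
    return det


def induced_matrix(a):
    """Coefficients of the distinct y-side moments (l <= m) in each M_ij (i <= j)."""
    k = len(a)
    pairs = list(combinations_with_replacement(range(k), 2))
    induced = []
    for i, j in pairs:
        row = []
        for l, m in pairs:
            if l == m:
                row.append(a[i][l] * a[j][l])
            else:
                # The symmetric pair (l, m) and (m, l) share one distinct element.
                row.append(a[i][l] * a[j][m] + a[i][m] * a[j][l])
        induced.append(row)
    return induced
```

For a square matrix `a` of order `k`, comparing `determinant(induced_matrix(a))` with `determinant(a) ** (k + 1)` gives a direct numerical check of the statement that the Jacobian is the $(k+1)$-th power of $A$.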

Due to the importance of this in connection with Wishart's distribution, the following observations are of special interest.

(1) When the latent roots of the matrix $\| a_{ri} \|$ are all distinct, it is reducible to the diagonal form, and it has been shown by Rasch that in such a case $J = A^{k+1}$, and the method of analytic continuation has been suggested as a means of establishing the same result when not all the latent roots are distinct. We shall show this here by a consideration of the limit of a polynomial function.

Let us suppose that some of the latent roots are repeated. Let now a suitable number of the elements of the matrix be replaced by neighbouring values, i.e., by $a_{ri} + \epsilon\eta_{ri}$, so that the latent roots of the altered matrix, $\| a'_{ri} \|$ say, are all distinct. (This is always possible as there are an adequate or more than adequate number of elements for this purpose.) We shall now consider a linear transformation from the variables $(x)$ to the variables $(y)$ with $\| a'_{ri} \|$ as the transformation matrix. Using primes to denote the corresponding quantities in relation to the new transformation of the variables, we have $J' = (A')^{k+1}$, since $\| a'_{ri} \|$ is reducible to the diagonal form. Further $J'$, expressed as an expansion in its elements, is a polynomial function whose continuity properties yield by proceeding to the limit

$$J = \lim_{\epsilon \to 0} J' = \lim_{\epsilon \to 0} (A')^{k+1} = \left(\lim_{\epsilon \to 0} A'\right)^{k+1} = A^{k+1}.$$


(2) The following situation is sometimes met with. Consequent on the linearity of the transformation of the variables with constant coefficients we have

$$dx_r = \sum_{i=1}^{k} a_{ri}\, dy_i \qquad (r = 1, 2, \ldots, k),$$

which gives

$$d\left(x_i^{(r)} x_j^{(r)}\right) = \sum_{l=1}^{k} \sum_{m=1}^{k} a_{il}\, a_{jm}\, d\left(y_l^{(r)} y_m^{(r)}\right).$$

Summing over the $n$ values of $r$, we have $dM_{ij}$ transformed in terms of $(d\bar{M}_{lm})$ with the same coefficients as those that appear in the transformation of $dx_i\, dx_j$ in terms of $(dy_l\, dy_m)$. Also we know that

$$dM_{11}\, dM_{12} \cdots dM_{1k}\, dM_{22}\, dM_{23} \cdots dM_{kk} = J \cdot d\bar{M}_{11}\, d\bar{M}_{12} \cdots d\bar{M}_{1k}\, d\bar{M}_{22}\, d\bar{M}_{23} \cdots d\bar{M}_{kk}.$$

By the sameness of coefficients of transformation we can write

$$dx_1 dx_1\; dx_1 dx_2 \cdots dx_1 dx_k\; dx_2 dx_2\; dx_2 dx_3 \cdots dx_k dx_k = J \cdot dy_1 dy_1\; dy_1 dy_2 \cdots dy_1 dy_k\; dy_2 dy_2\; dy_2 dy_3 \cdots dy_k dy_k;$$

that is,

$$(dx_1\, dx_2 \cdots dx_k)^{k+1} = J \cdot (dy_1\, dy_2 \cdots dy_k)^{k+1}.$$

But $dx_1\, dx_2 \cdots dx_k = A \cdot dy_1\, dy_2 \cdots dy_k$, so that $J = A^{k+1}$. These formal operations show that in this case the differentials when multiplied in the usual way work like the determinants they signify.

(3) After obtaining that $J = A^{k+1}$, Rasch's functional equation is seen to hold good.


(4) When the constant in Wishart's distribution is evaluated in Rasch's notation using H. Cramér's method (see Mathematical Methods of Statistics, Princeton University Press, 1946, pp. 390-393), it will be found that a power of $n$ is missing in the numerator. This is due to the fact that we have not considered the estimate $(1/n) \| M_{ij} \|$ but worked with $\| M_{ij} \|$.

I thank Professor H. Cramér and Mr. G. Blom for their interest in this work. 




A NOTE ON A TWO SAMPLE TEST¹


By Frank J. Massey, Jr. 
University of Oregon 


1. Summary. Mood ([1], p. 394) discusses a test for the hypothesis that two 
samples come from populations having the same continuous cumulative frequency 
distributions. It consists of arranging the observations from the two samples in 
a single group in order of size and then comparing the numbers in the two samples 
above the median. This technique is extended to using several order statistics 
from the combined samples, and to the case of several samples. The test is non- 
parametric and might be a good substitute for the single variable of classifica- 
tion analysis of variance in cases of doubtful normality. The application of the 
test would be the same as in Mood ([1], p. 398) except that there would be more
than two rows in the table. 


2. The distribution function. Suppose we have $p$ populations all having the same continuous cumulative distribution function $F(x)$. Let $X_{ij}$ ($i = 1, 2, \ldots, p$; $j = 1, 2, \ldots, n_i$) be the $j$th observation in a sample of size $n_i$ from the $i$th population. Let $\sum_{i=1}^{p} n_i = N$.

Arrange these $N$ observations in a single series according to size and rename them $z_1 < z_2 < \cdots < z_N$. We choose $k - 1$ of the $z$ values, for example, $z_{a_1}, z_{a_2}, \ldots, z_{a_{k-1}}$ (the $a_i$ are integers and $1 \le a_1 < a_2 < \cdots < a_{k-1} < N$). Denote by $m_{ij}$ the number of observations $X_{i\nu}$ such that $z_{a_{j-1}} < X_{i\nu} \le z_{a_j}$ for $j = 2, 3, \ldots, k - 1$, by $m_{i1}$ the number of $X_{i\nu} \le z_{a_1}$, and by $m_{ik}$ the number of $X_{i\nu} > z_{a_{k-1}}$. These can be illustrated by the following table.


¹ This paper sponsored in part by the Office of Naval Research, Contract N6-onr-218/IV, Project NR 042 063.





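| 1st sample | 2nd sample | $\cdots$ | $p$th sample | Total |
|---|---|---|---|---|
| $m_{11}$ | $m_{21}$ | $\cdots$ | $m_{p1}$ | $\sum_{i=1}^{p} m_{i1} = a_1$ |
| $m_{12}$ | $m_{22}$ | $\cdots$ | $m_{p2}$ | $\sum_{i=1}^{p} m_{i2} = a_2 - a_1$ |
| $\vdots$ | $\vdots$ | | $\vdots$ | $\vdots$ |
| $m_{1k}$ | $m_{2k}$ | $\cdots$ | $m_{pk}$ | $\sum_{i=1}^{p} m_{ik} = N - a_{k-1}$ |
| $\sum_{j=1}^{k} m_{1j} = n_1$ | $\sum_{j=1}^{k} m_{2j} = n_2$ | $\cdots$ | $\sum_{j=1}^{k} m_{pj} = n_p$ | $N$ |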


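The joint distribution of the $m_{ij}$ and $z_{a_j}$ can be written as

$$\frac{\prod_{i=1}^{p} n_i!}{\prod_{i=1}^{p} \prod_{j=1}^{k} m_{ij}!}\, [F(z_{a_1})]^{a_1 - 1}\, dF(z_{a_1})\, [F(z_{a_2}) - F(z_{a_1})]^{a_2 - a_1 - 1}\, dF(z_{a_2}) \cdots$$
$$[F(z_{a_{k-1}}) - F(z_{a_{k-2}})]^{a_{k-1} - a_{k-2} - 1}\, [1 - F(z_{a_{k-1}})]^{N - a_{k-1}}\, dF(z_{a_{k-1}}) \cdot \sum m_{\beta_1,1}\, m_{\beta_2,2}\, m_{\beta_3,3} \cdots m_{\beta_{k-1},k-1},$$

where the sum runs over all possible sets of values of $\beta_i = 1, 2, 3, \ldots, p$. It is easy to show that this sum is equal to the product

$$a_1 (a_2 - a_1)(a_3 - a_2) \cdots (a_{k-1} - a_{k-2}).$$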


Now the joint distribution of the $m_{ij}$ is obtained by integrating the $z$'s over their entire range $z_{a_j} < z_{a_{j+1}}$. This is a type of Dirichlet integral and we get

$$f(m_{11}, \ldots, m_{pk}) = \frac{\prod_{i=1}^{p} n_i!\, \prod_{j=1}^{k} (a_j - a_{j-1})!}{N!\, \prod_{i=1}^{p} \prod_{j=1}^{k} m_{ij}!},$$

where $a_k = N$, $a_0 = 0$. This is the distribution of cell frequencies in a $p$ by $k$ contingency table with all totals fixed when there is independence (see [1], p. 278) and thus, for large values of $m_{ij}$ at least, the usual chi-square test can be used.
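For small samples the distribution can be compared with a brute-force count. The sketch below assumes that, under the hypothesis, every arrangement of the sample labels along the ordered series $z_1 < \cdots < z_N$ is equally likely, and that cell $j$ collects the ranks $a_{j-1} + 1, \ldots, a_j$; the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial, prod


def table_probability(table, cuts):
    """Closed-form probability of the table m_ij; table[i][j] = m_ij, cuts = (a_1, ..., a_{k-1})."""
    sizes = [sum(row) for row in table]
    total = sum(sizes)
    a = (0, *cuts, total)  # a_0 = 0 and a_k = N
    k = len(a) - 1
    numerator = prod(factorial(n_i) for n_i in sizes) * prod(
        factorial(a[j + 1] - a[j]) for j in range(k)
    )
    denominator = factorial(total) * prod(factorial(m) for row in table for m in row)
    return Fraction(numerator, denominator)


def brute_force(sizes, cuts):
    """Distribution of the table obtained by listing every distinct label arrangement."""
    labels = [i for i, n_i in enumerate(sizes) for _ in range(n_i)]
    a = (0, *cuts, len(labels))
    k = len(a) - 1
    arrangements = set(permutations(labels))  # distinct orderings of sample labels
    counts = Counter()
    for arrangement in arrangements:
        table = tuple(
            tuple(sum(1 for r in range(a[j], a[j + 1]) if arrangement[r] == i) for j in range(k))
            for i in range(len(sizes))
        )
        counts[table] += 1
    return {table: Fraction(c, len(arrangements)) for table, c in counts.items()}
```

The enumeration grows factorially with $N$, so it is only meant as a check on very small cases such as two samples of sizes 2 and 3 with a single cut.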


REFERENCE
[1] A. M. Mood, Introduction to the Theory of Statistics, McGraw-Hill Book Co., 1950.


AN OMISSION IN NORTON’S LIST OF 7 X 7 SQUARES 
By ALBERT SADE 
Marseille, France 


1. In a previous paper the value 16,942,080 for the number of reduced 7 × 7 squares was obtained by the author by an exhaustive method, subject to a strict control ([4], Section 20). This number exceeds Norton's ([2], Table on p. 290) by 14,112. An attempt was made in Section 21 of [4] to show that this discrepancy in the total number does not affect Norton's conjecture ([2], p. 291) that the 146 species represent the whole of the universe of 7 × 7 Latin squares. However, R. A. Fisher has informed the author that the discrepancy cannot be explained away in this manner. It has therefore to be attributed to a gap in Norton's list.


2. Now, a 147th species containing 14,112 squares can arise only from an automorph type through an operator of the order 5. It is easy to construct a matrix $Q$ corresponding to such an operator as, for example, $T = (34567)^3$. Here the cycle $(34567)$ signifies a permutation [1] of columns, a permutation of rows and a substitution of elements.

The first two rows of Q are respectively identical with the first two columns 
and define the substitution (12) (34567). In the remaining 5 X 5 squares, it is 
necessary that the elements of the broken diagonals follow in the natural cyclic 
order, except the numbers 1 and 2, which each form a broken diagonal. 

The square is given below: 


[The 7 X 7 square Q was lost in extraction and cannot be recovered from this copy.]


3. On replacing each row of Q by the conjugate permutation and rotating the 
figure through an angle of 180° about the diagonal, we obtain the square 





[The 7 X 7 square R was lost in extraction and cannot be recovered from this copy.]


It is easy to verify the equality 
R(12-36475)(12-34567) = Q, 


in which the first factor is a permutation of columns and the second a substitu- 
tion of numbers. 
Thus the number of reduced squares produced by Q is 


$$6(7!)^3/(3 \cdot 5 \cdot 7! \cdot 6!) = 14{,}112,$$


which is precisely the difference mentioned in Section 1.
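
The arithmetic can be checked mechanically. The short sketch below evaluates the quotient with exact integers and subtracts it from the total of Section 1 to obtain the count implied for Norton's table; the variable names are illustrative only.

```python
from math import factorial

# Number of reduced squares produced by Q: 6 (7!)^3 / (3 * 5 * 7! * 6!)
numerator = 6 * factorial(7) ** 3
denominator = 3 * 5 * factorial(7) * factorial(6)
squares_from_q, remainder = divmod(numerator, denominator)

# Total reported in Section 1, and the total it implies for Norton's table.
sade_total = 16_942_080
norton_total = sade_total - squares_from_q
```

The division is exact (`remainder` is zero), and `norton_total` is the figure one would expect to find in Norton's table if the 14,112 squares are the only ones missing.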

4. The reversal of the unique intercalate 12 in Q gives a square S isomorphic 
with Q, and on interchanging rows and columns and the numbers 1 and 2 in S, 
we obtain Q again. Thus S is one of the 14,112 squares considered in Section 3 
and does not give a new species. Therefore, Norton’s conjecture ([2], p. 291)
“that they can be enumerated by an exhaustive reversal of intercalates” is not
borne out, at least for species with one intercalate. This assumption was founded 
on its truth for 6 X 6 squares; but it is to be expected that the classification of 
n X n squares would become more complicated with increasing n. 

5. On the contrary, the conclusion of S. G. Ghurye [3] is confirmed, for the 
square Q possesses a different “J — A” from those of other species. 


REFERENCES 

[1] T. B. Sprague, “A new algebra, by means of which permutations can be transformed
in a variety of ways, and their properties investigated,” Trans. Roy. Soc. Edinburgh,
Vol. 37 (1892), pp. 399-411.

[2] H. W. Norton, “The 7 X 7 squares,” Annals of Eugenics, Vol. 9 (1939), pp. 269-307.

[3] S. G. Ghurye, “A characteristic of species of 7 X 7 Latin squares,” Annals of Eugenics,
Vol. 14 (1948), p. 133.

[4] A. Sade, Enumération des carrés latins. Application au 7ème ordre. Conjectures pour les
ordres supérieurs, privately published.







RELATIONS BETWEEN MOMENTS OF ORDER STATISTICS¹
By Randal H. Cole
University of Western Ontario 


Summary. Moments of order statistics multiplied by appropriate factors are 
called normalized moments. These normalized moments are shown to be suc- 
cessive differences of normalized moments of largest order statistics. 


1. Introduction. After discovering the following relations, the author learned 
in conversation with H. O. Hartley that similar relations had been used by him 
and others at the University of London. They did not, however, recognize the 
advantage of expressing them in terms of what are here called “normalized 
moments.” The extreme simplicity of the relations is a direct result of this 
device. 


2. Derivation of the relations. Let $x_{1|n} > x_{2|n} > \cdots > x_{n|n}$ be the order
statistics from a sample of size $n$. Let $g_{i|n}(x)$ be defined by
(1) $g_{i|n}(x) = F^{n-i}(x)[1 - F(x)]^{i-1} f(x),$


where $f(x)$ and $F(x)$ are, respectively, the pdf and cdf of the population from
which the sample is drawn. Then, the pdf of $x_{i|n}$ is

$$n C^{n-1}_{i-1} \, g_{i|n}(x),$$

where $C^m_j = m!/[j!(m - j)!]$. If $r$ is any integer such that $r \le i$, we may write

$$[1 - F(x)]^{i-1} = [1 - F(x)]^{r-1} [1 - F(x)]^{i-r} = [1 - F(x)]^{r-1} \sum_{j=0}^{i-r} (-1)^{i-r-j} C^{i-r}_j F^{i-r-j}(x).$$


Substituting this in (1), we have

(2) $g_{i|n}(x) = \sum_{j=0}^{i-r} (-1)^{i-r-j} C^{i-r}_j \, g_{r|n-j}(x), \qquad r \le i \le n.$

The relations (2) may be written in matrix form. By letting $C^r_s = 0$ for $s > r$,
and considering the expansion of $[1 - (1 - y)]^r$, it can be shown that the inverse
of the $(n + 1)$ by $(n + 1)$ matrix

$$P_n = (C^r_s), \qquad r, s = 0, 1, \cdots, n,$$

is

$$P_n^{-1} = ((-1)^{r+s} C^r_s).$$
1 Prepared in connection with research carried out at Princeton University and spon- 
sored by the Office of Naval Research. 







We shall introduce the symbol $g'_{r \cdots n|n}(x)$ to represent the vector

$$(g_{r|n}(x), \cdots, g_{n|n}(x)).$$
Similarly let $g'_{r|n \cdots r}(x)$ represent the vector
$$(g_{r|n}(x), \cdots, g_{r|r}(x)).$$

The vector $g_{r \cdots n|n}(x)$ will be the transpose of the vector $g'_{r \cdots n|n}(x)$, etc. A similar
notation will be used for vectors whose components are moments.
Using these conventions, the relations (2) can be written

$$g_{r \cdots n|n}(x) = P^{-1}_{n-r} \, g_{r|n \cdots r}(x),$$
or, by inverting,
(3) $g_{r|n \cdots r}(x) = P_{n-r} \, g_{r \cdots n|n}(x).$
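
The inversion used to pass to relation (3) rests on the fact that the matrix of binomial coefficients and its signed counterpart are inverse to each other. A minimal numerical check of that fact is sketched below; the function names are illustrative, and rows and columns are indexed from 0 to $n$ as an assumption consistent with the $(n + 1)$ by $(n + 1)$ size.

```python
from math import comb


def binomial_matrix(n):
    """P_n = (C^r_s) for r, s = 0..n; math.comb returns 0 when s > r."""
    return [[comb(r, s) for s in range(n + 1)] for r in range(n + 1)]


def signed_binomial_matrix(n):
    """Candidate inverse ((-1)^(r+s) C^r_s)."""
    return [[(-1) ** (r + s) * comb(r, s) for s in range(n + 1)] for r in range(n + 1)]


def matmul(a, b):
    """Plain integer matrix product of two square lists of lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


n = 5
identity = [[1 if i == j else 0 for j in range(n + 1)] for i in range(n + 1)]
product_matrix = matmul(binomial_matrix(n), signed_binomial_matrix(n))
```

For the chosen order the product equals the identity matrix, as the expansion of $[1 - (1 - y)]^r$ predicts.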


Let $\mu_{i|n}$ represent the $t$th moment of $x_{i|n}$. The omission of $t$ in this designation
simplifies the notation and can lead to no confusion, since all the relations between
moments are independent of $t$. Let the quantity

$$\nu_{i|n} = \int_{-\infty}^{\infty} g_{i|n}(x) \, x^t \, dx$$

be called the normalized $t$th moment of $x_{i|n}$. Evidently
$$\mu_{i|n} = n C^{n-1}_{i-1} \, \nu_{i|n}.$$

If relations (3) are multiplied by $x^t$ and integrated, we have, in terms of the
vector notation previously introduced,
(4) $\nu_{r|n \cdots r} = P_{n-r} \, \nu_{r \cdots n|n}, \qquad r \le n.$

From this fundamental relation, two special cases of interest can be written
down. First, by letting $r = 1$, we have

$$\nu_{1|n \cdots 1} = P_{n-1} \, \nu_{1 \cdots n|n}.$$


Second, because of the triangular nature of $P_{n-r}$, we may delete the last $k$
components of the two vectors in relation (4), make a corresponding reduction
in the order of the matrix, and write


$$\nu_{r|n \cdots r+k} = P_{n-r-k} \, \nu_{r \cdots n-k|n}.$$
In particular, if $k = n - r - 1$, we obtain
$$\nu_{r|n-1} = \nu_{r|n} + \nu_{r+1|n}.$$


That is to say, the normalized moments for a sample of size n — 1 can be ob- 
tained by summing adjacent pairs of normalized moments for a sample of size n. 
It follows that the normalized moments for all samples of size less than n can 
be obtained, by the simple operation of addition, from the normalized moments 







for a sample of size $n$. By reversing the process, it is clear that if $\nu_{1|1}, \nu_{1|2},
\cdots, \nu_{1|n}$ are known, the normalized moments for all samples of size no greater
than $n$ can be determined by successive differencing, although in this case there
is a progressive loss of significant figures.
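
Both directions of the recurrence can be tried on a case where the normalized moments are known in closed form. The sketch below assumes a uniform population on (0, 1), for which the normalized $t$th moment of the $i$th largest of $n$ is the beta integral $(n - i + t)!\,(i - 1)!/(n + t)!$; the uniform choice and the function names are assumptions made only for illustration.

```python
from fractions import Fraction
from math import factorial


def nu_uniform(i, n, t):
    """Normalized t-th moment of the i-th largest of n uniform(0, 1) variates."""
    return Fraction(factorial(n - i + t) * factorial(i - 1), factorial(n + t))


def step_down(nu_n):
    """From [nu_{1|n}, ..., nu_{n|n}] obtain the row for size n - 1 by adding adjacent pairs."""
    return [a + b for a, b in zip(nu_n, nu_n[1:])]


def from_largest(nu_largest):
    """From [nu_{1|1}, ..., nu_{1|n}] rebuild every row by successive differencing:
    nu_{r+1|m} = nu_{r|m-1} - nu_{r|m}."""
    rows = []
    for m, first in enumerate(nu_largest, start=1):
        row = [first]
        for r in range(1, m):
            row.append(rows[m - 2][r - 1] - row[r - 1])
        rows.append(row)
    return rows


t, n = 2, 5
row_n = [nu_uniform(i, n, t) for i in range(1, n + 1)]
row_n_minus_1 = step_down(row_n)
rebuilt = from_largest([nu_uniform(1, m, t) for m in range(1, n + 1)])
```

With exact fractions the differencing direction reproduces `row_n` without error; in floating point the subtractions would show the progressive loss of significant figures mentioned in the text.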




CORRECTION TO “THE PROBLEM OF THE GREATER MEAN” 
By Raghu Raj Bahadur and Herbert Robbins
University of Chicago and University of North Carolina 


In the paper mentioned in the title (Annals of Mathematical Statistics, Vol.
21 (1950), pp. 469-487), the paragraph on page 484 beginning “We have given
no criterion . . .” is erroneous, and should be omitted. The following paragraph
would then read: “Let us suppose that $\Omega$ is given by (33). Then $f^0(v)$ is admissible
and minimax, by the preceding paragraph. There is, however, another reason
for preferring $f^0(v)$ . . . .”

We remark that in case a point on the plane $\{\omega : m_1 = m_2\}$ is an interior point
of $\Omega$ and the risk function is $\bar r$, then (contrary to statements in the erroneous
paragraph) $f^0(v)$ possesses the following property. If $f(v)$ is a decision function
such that $f(v) \ne f^0(v)$ and

$$\sup_{\omega \in \Omega} \bar r(f \mid \omega) \le \sup_{\omega \in \Omega} \bar r(f^0 \mid \omega) \; (= \tfrac{1}{2}),$$

then $\bar r(f^0 \mid \omega) \le \bar r(f \mid \omega)$ for all $\omega$ in $\Omega$, the inequality being strict whenever $m_1 \ne m_2$.
It follows that $f^0(v)$ is the unique decision function which is admissible and minimax.
A proof of this remark is contained in an unpublished paper by R. R. Bahadur
entitled “A Property of the t Statistic.”




ERRATA 


By P. V. Krishna Iyer
University of Oxford
In the author’s paper “The theory of probability distributions of points on
a lattice” (Annals of Math. Stat., Vol. 21 (1950), pp. 198-217), read “2 X 2 X 3”
for “2 X 3 X 3” on page 211, line 22, and on page 213, Table 8, heading.




ABSTRACTS OF PAPERS 


(Abstracts of papers presented at the Oak Ridge meeting of the Institute, March 15-17, 1951)




1. Confidence Intervals for the Mean Rate at Which Radioactive Particles
Impinge on a Type I Counter. (Preliminary Report.) G. E. Albert, University
of Tennessee and Oak Ridge National Laboratory.







The number of particles impinging on a Geiger-Mueller counter in a time interval of
length $t$ is assumed to be a random variable with a Poisson distribution of mean $at$. Starting
with Feller’s results for a Type I counter given in his paper “On probability problems in
the theory of counters” in the Courant Anniversary Volume, 1948, it is shown that the count
$N$ registered by the counter in time $t$ has the distribution: $\Pr(N \ge m) = 0$ if $m \ge (t/u) + 1$,
and $\Pr(N \ge m) = \exp(-A) \sum_{k \ge m} A^k/k!$, $A = a[t - (m - 1)u]$, if $m < (t/u) + 1$, where $u$ is
the dead time of the counter. Confidence interval charts for the parameter $b = au$ for various
values of $t' = t/u$ are prepared by the usual inversion procedure. If $N$ and $t - Nu$ are both
large, approximate confidence intervals for the parameter $a$ take the simple form

$$(N \pm z_p N^{1/2})/[t - (N - 1)u],$$

where $z_p$ is the two-tailed percentage point of the normal distribution for the confidence
level $1 - p$.


2. A Problem of Elapsed Times in a Sequence of Events. Osmer Carpenter,
Carbide and Carbon Chemicals Division, Oak Ridge.


The problem considered refers to a series of random events forming a sequence in time
or space, for example, the emission of particles by radioactive matter. From a sequence $f$
of such events, a derived sequence $g$ is formed by selecting from $f$ all those events which
follow the preceding event by an elapsed time greater than a given constant, $U \ge 0$. The
times between successive events in the sequence $f$ are taken to be independently distributed
by a known distribution function, $F(t)$. It is required to find the distribution functions of
elapsed time and of the number of counts per fixed time interval for the derived sequence,
$g$. A general method is applied to the solution of the exponential case, $F(t) = ke^{-kt}$.


3. On the Existence of Unbiased Tests for Testing Composite Hypotheses.
Esther Seiden, University of Buffalo.


The following problem was suggested by J. Neyman. Let $X$ be an observable random
variable, multivariate or not, and $H$ a composite hypothesis concerning $X$. Let $\bar H$ denote
a hypothesis, concerning $X$, alternative to $H$. Finally, let $\alpha$ be a chosen level of significance.
What restriction should one impose on the hypotheses $H$ and $\bar H$ in order that there exists
a critical region $w$ such that (i) $P(X \in w \mid H) = \alpha$, and, whatever be the simple hypothesis
$h \in \bar H$, (ii) $P(X \in w \mid h) > \alpha$? It is shown now that if $H$ as well as $\bar H$ consists in assuming
that the random variable $X$ follows a continuous distribution law, then there exists always
the most powerful region $w$ satisfying conditions (i) and (ii), provided that the distributions
belonging to $H$ and $\bar H$ are linearly independent. If $H$ and $\bar H$ are infinite families of absolutely
continuous distributions and condition (i) is replaced by (i′) $P(X \in w \mid H) \le \alpha$, then for
some $\alpha$ less than $\tfrac{1}{2}$ there exists a region $w$ satisfying conditions (i′) and (ii), provided that
the convex closures of $H$ and $\bar H$ are disjoint.


4. Group Divisible Incomplete Block Designs. R. C. Bose, University of North 
Carolina. 


An incomplete block design with $v$ treatments each replicated $r$ times in $b$ blocks of size
$k$ is said to be group divisible if the treatments can be divided into $m$ groups each with
$n$ treatments, so that the treatments of the same group occur together in $\lambda_1$ blocks and
treatments of different groups occur together in $\lambda_2$ blocks, $\lambda_1 \ne \lambda_2$. The parameters are
connected by the relations $v = mn$, $bk = vr$, $\lambda_1(n - 1) + \lambda_2 n(m - 1) = r(k - 1)$. It is shown
that these designs fall into three classes: (i) singular for which $r = \lambda_1$, (ii) semiregular
for which $r > \lambda_1$, $rk = v\lambda_2$, (iii) regular for which $r > \lambda_1$, $rk > v\lambda_2$. It is proved that for
regular designs $b \ge v$, and for semiregular designs $b \ge v - m + 1$, every block containing







the same number of treatments from each group. A singular design is always derivable
from a balanced incomplete block design by replacing each treatment by a group of $n$
new treatments. When $b = v$ the quantity $(r - \lambda_1)^{v-m} (rk - v\lambda_2)^{m-1}$ must be a perfect
square, and the Hasse invariant of $NN'$, where $N$ is the incidence matrix of the design, must
be $+1$. The value of this invariant has been calculated in terms of the parameters. The
parameters for all group divisible designs with $r \le 10$, $k \le 10$, whose existence is not ruled
out by theorems stated above, have been listed. Combinatorial solutions for most of these
have been derived, though there remain a number of unsolved cases. The analysis of variance
and the equations for intra- and inter-block estimates have been given. These designs
are likely to prove useful both in varietal trials and in factorial experiments.
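
The parameter relations and the three-way classification lend themselves to a small check. The sketch below counts concurrences in a list of blocks and classifies a parameter set; the two tiny designs used as examples are constructed here for illustration and are not taken from the abstract, and the function names are hypothetical.

```python
from itertools import combinations


def concurrences(blocks, groups):
    """Return the sets of concurrence counts for same-group and different-group pairs."""
    group_of = {t: g for g, members in enumerate(groups) for t in members}
    within, between = set(), set()
    for a, b in combinations(sorted(group_of), 2):
        count = sum(1 for block in blocks if a in block and b in block)
        (within if group_of[a] == group_of[b] else between).add(count)
    return within, between


def classify(v, b, r, k, m, n, lam1, lam2):
    """Check the parameter relations and name the class of a group divisible design."""
    if v != m * n or b * k != v * r:
        raise ValueError("need v = mn and bk = vr")
    if lam1 * (n - 1) + lam2 * n * (m - 1) != r * (k - 1):
        raise ValueError("need lam1(n - 1) + lam2 n(m - 1) = r(k - 1)")
    if r == lam1:
        return "singular"
    if r > lam1 and r * k == v * lam2:
        return "semiregular"
    if r > lam1 and r * k > v * lam2:
        return "regular"
    raise ValueError("parameters fit none of the three classes")


# Illustrative design 1: two groups {0, 1}, {2, 3}; blocks are all cross-group pairs.
blocks_1 = [{0, 2}, {0, 3}, {1, 2}, {1, 3}]
class_1 = classify(v=4, b=4, r=2, k=2, m=2, n=2, lam1=0, lam2=1)

# Illustrative design 2: each treatment of the trivial design {A,B}, {A,C}, {B,C}
# replaced by a group of two new treatments.
blocks_2 = [{0, 1, 2, 3}, {0, 1, 4, 5}, {2, 3, 4, 5}]
class_2 = classify(v=6, b=3, r=2, k=4, m=3, n=2, lam1=2, lam2=1)
```

The first example is semiregular, and each of its blocks takes one treatment from each group; the second is singular, as expected for a design obtained by replacing treatments with groups.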


5. Orthogonal Arrays of Strength Two and Three. R. C. Bose and Kenneth
A. Bush, University of North Carolina.


Consider a matrix $A = ((a_{ij}))$ with $m$ rows and $N$ columns where each element $a_{ij}$ represents
one of the $s$ integers $0, 1, 2, \cdots, s - 1$. The columns of any $t$-rowed submatrix of $A$
provide $N$ ordered $t$-plets. The matrix $A$ is called an orthogonal array $(N, m, s, t)$ of size
$N$, $m$ constraints, $s$ levels, and strength $t$ if each of the $C^m_t$ partial $t$-rowed matrices formed
from $A$ contains all the $s^t$ possible ordered $t$-plets each repeated $\lambda$ times ($N = \lambda s^t$). The
known upper bounds for the number of constraints when $t = 2$ and 3 have been improved:
If $\lambda - 1 = a(s - 1) + b$, $0 \le b < s - 1$, and $n$ is the largest positive integer (including
0) consistent with $s(b - 2n) \ge (b - n)(b - n + 1)$, then for the case $t = 2$,
$m \le [(\lambda s^2 - 1)/(s - 1)] - n - 1$, and for the case $t = 3$, $m \le [(\lambda s^2 + s - 2)/(s - 1)] - n - 1$.
Methods of constructing orthogonal arrays of strength 2 and 3 have been investigated.
A difference theorem enabling the construction of the arrays (18, 7, 3, 2) and (32, 9, 4, 2)
has been proved, and it is shown that if $s = p^n$, $\lambda = p^u$, where $p$ is a prime, then we can
construct the array $(\lambda s^2, m, s, 2)$, with


m=] + pte + «++ 4 prete - pretets 
where $u = rn + d$, $0 \le d < n$, $r \ge 0$. Another theorem connects finite projective geometries
with orthogonal arrays, and is used to construct the arrays (i) $(s^3, s + 2, s, 3)$ when $s = 2^n$;
(ii) $(s^3, s + 1, s, 3)$ when $s = p^n$, $p$ being an odd prime; (iii) $(s^4, s^2 + 1, s, 3)$ when $s = p^n$,
where $p$ is a prime; (iv) $(s^r, s^{r-1}, s, 3)$ when $s = 2$. Orthogonal arrays are useful in connection
with many problems of experimental design.
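
The defining property of an orthogonal array can be tested directly from the definition. The following is a minimal sketch, with a hypothetical function name and a small example array chosen for illustration: it checks that every choice of $t$ rows shows each of the $s^t$ ordered $t$-plets equally often among the columns.

```python
from collections import Counter
from itertools import combinations, product


def is_orthogonal_array(rows, s, t):
    """rows: m lists of length N with entries in 0..s-1.
    True when every t-rowed submatrix contains each ordered t-plet N / s**t times."""
    n_columns = len(rows[0])
    if n_columns % s ** t != 0:
        return False
    repeats = n_columns // s ** t
    for chosen in combinations(rows, t):
        counts = Counter(zip(*chosen))  # one t-plet per column
        if any(counts[tup] != repeats for tup in product(range(s), repeat=t)):
            return False
    return True


# Illustrative array with N = 4 columns, m = 3 constraints, s = 2 levels.
example = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
]
strength_two = is_orthogonal_array(example, s=2, t=2)
strength_three = is_orthogonal_array(example, s=2, t=3)
```

The example has strength two with each pair appearing once, but it cannot have strength three because four columns cannot hold all eight triplets.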


6. The Structure of Balanced Incomplete Block Designs, and the Impossibility
of Certain Unsymmetrical Cases. William S. Connor, University of North
Carolina.


If $a_{ij}$ is the number of treatments common to the $i$th and $j$th blocks of a balanced
incomplete block design with $v$ treatments, $b$ blocks, $r$ replications, and $k$ treatments per
block, with any two treatments occurring together in the same block $\lambda$ times, then the
characteristic matrix $C = ((c_{ij}))$ of the design may be defined by $c_{ii} = (r - k)(r - \lambda)$,
$i = 1, 2, \cdots, b$, $c_{ij} = k\lambda - r a_{ij}$, $i, j = 1, 2, \cdots, b$, $i \ne j$. If $|C_t|$ is any symmetrically chosen
partial determinant of order $t$ belonging to $C$, we prove that (i) if $t < b - v$, $|C_t|$ is nonnegative; (ii)
if $t = b - v$, then $|C_t| \, k (r - \lambda)^{v-t-1} r^{-t+1}$ is a perfect square; (iii) if $t > b - v$, $|C_t| = 0$.
From (i) Fisher’s inequality $b \ge v$ is deduced and it is shown that

$$k + \lambda - r \le a_{ij} \le r - \lambda - k + 2\lambda k/r.$$

The structure of the designs (a) $v = 15$, $b = 21$, $r = 7$, $k = 5$, $\lambda = 2$; (b) $v = 36$, $b = 45$,
$r = 10$, $k = 8$, $\lambda = 2$; (c) $v = 21$, $b = 28$, $r = 8$, $k = 6$, $\lambda = 2$ is studied. For (a) and
(b) it is proved that there must exist $b - v$ blocks, the $|C_{b-v}|$ for which contradicts (ii). For







(c) it is shown that if the incidence matrix $N$ is augmented to $N_0$ by adding 7 suitably chosen
row vectors then the Hasse invariant $C_p(N_0 N_0')$ for $N_0 N_0'$ is $-1$, when $p = 3$. This demonstrates
the impossibility of (a), (b), and (c). The last two results are new. (Research carried
on under the sponsorship of the Office of Naval Research.)
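
It may help to see that the three parameter sets pass the elementary counting conditions, which is why a finer structural argument is needed to rule them out. The sketch below checks the standard identities $bk = vr$ and $\lambda(v - 1) = r(k - 1)$ (well-known relations not stated in the abstract), Fisher's inequality, and the range allowed for the number of treatments common to two blocks; the function names are illustrative assumptions.

```python
from fractions import Fraction


def necessary_conditions(v, b, r, k, lam):
    """Elementary counting conditions every balanced incomplete block design satisfies."""
    return {
        "bk = vr": b * k == v * r,
        "lam(v - 1) = r(k - 1)": lam * (v - 1) == r * (k - 1),
        "b >= v": b >= v,
    }


def intersection_bounds(r, k, lam):
    """Lower and upper bounds on the number of treatments common to two blocks."""
    lower = k + lam - r
    upper = Fraction(r - lam - k) + Fraction(2 * lam * k, r)
    return max(lower, 0), upper


designs = {
    "a": dict(v=15, b=21, r=7, k=5, lam=2),
    "b": dict(v=36, b=45, r=10, k=8, lam=2),
    "c": dict(v=21, b=28, r=8, k=6, lam=2),
}
checks = {name: necessary_conditions(**p) for name, p in designs.items()}
bounds = {name: intersection_bounds(p["r"], p["k"], p["lam"]) for name, p in designs.items()}
excess_blocks = {name: p["b"] - p["v"] for name, p in designs.items()}
```

All three sets satisfy the counting conditions, and for (c) the excess $b - v$ equals 7, the number of row vectors adjoined in the argument above.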


7. Some Bounded Significance Level Tests for the Median. John E. Walsh,
The Rand Corporation.


In practice it is often permissible to assume that the observations of a set are statistically 
independent and from continuous populations with a common median. This is the case, 
for example, if the observations are a sample from a continuous population. Then the 
population median can be investigated by using the sign test. For small numbers of ob- 
servations, however, the sign test does not furnish very many suitable significance levels. 
Also, some of the sign tests with suitable significance levels are not very efficient. This 
note presents some tests whose significance levels are only approximate but cover a wide 
range of suitable values. The significance levels of these tests are exactly determined if the 
populations are also symmetrical; they are bounded otherwise. Some of these bounded 
significance level tests have high efficiencies. 
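
The remark that the sign test offers few significance levels for small samples can be made concrete. The sketch below lists the exact two-sided levels available when the test rejects for at most $c$ or at least $n - c$ positive signs among $n$ observations; this particular rejection rule and the function name are assumptions made for illustration, and the tests proposed in the note are not reproduced here.

```python
from fractions import Fraction
from math import comb


def two_sided_sign_test_levels(n):
    """Exact levels of the two-sided sign test for each cutoff c with c < n - c."""
    levels = []
    for c in range((n - 1) // 2 + 1):
        lower_tail = sum(comb(n, j) for j in range(c + 1))
        levels.append((c, Fraction(2 * lower_tail, 2 ** n)))
    return levels


levels_for_six = two_sided_sign_test_levels(6)
```

With six observations the only available levels are 1/32, 7/32 and 11/16, so nothing near the conventional 5 per cent level can be attained exactly.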


8. Joint Sampling Distribution of the Mean and Standard Deviation for Distribution
Functions of the First Kind. Melvin D. Springer, U. S. Naval Ordnance,
Indianapolis.


Consider a universe characterized by the distribution function $f(x)$, $-\infty < x < \infty$.
If $n$ variates $x_i$, $i = 1, 2, \cdots, n$, are selected at random from this universe, the probability
that they will fall simultaneously within the intervals $dx_i$, $i = 1, 2, \cdots, n$, is given, to
within infinitesimals of higher order, by $f(x_1) f(x_2) \cdots f(x_n) \, dx_1 \, dx_2 \cdots dx_n$. As an immediate
consequence of the definitions of $\bar x$ and $s$ one may employ the transformation $T$: $x_1 = x_1$,
$x_2 = x_2$, $\cdots$, $x_{n-2} = x_{n-2}$, $x_{n-1} = (n\bar x - \sum_{i=1}^{n-2} x_i + Q_1)/2$, $x_n = (n\bar x - \sum_{i=1}^{n-2} x_i - Q_1)/2$, where

$$Q_1 = \Big[ -3 \sum_{i=1}^{n-2} x_i^2 - 2 \sum_{i<j}^{n-2} x_i x_j + 2n\bar x \sum_{i=1}^{n-2} x_i - n(n - 2)\bar x^2 + 2ns^2 \Big]^{1/2}.$$

Application of this transformation gives

$$f(x_1) f(x_2) \cdots f(x_n) \, dx_1 \, dx_2 \cdots dx_n = f(x_1) f(x_2) \cdots f(x_{n-2}) \, f\Big(\big[n\bar x - \sum_{i=1}^{n-2} x_i - Q_1\big]/2\Big) \, f\Big(\big[n\bar x - \sum_{i=1}^{n-2} x_i + Q_1\big]/2\Big) \, |J| \, dx_1 \, dx_2 \cdots dx_{n-2} \, d\bar x \, ds,$$

where $|J| = |\text{Jacobian of } T| = n^2 s/Q_1$. Evaluation of the multiple integral

$$F(\bar x, s) = \int \int \cdots \int f(x_1) f(x_2) \cdots f(x_{n-2}) \, f\Big(\big[n\bar x - \sum_{i=1}^{n-2} x_i - Q_1\big]/2\Big) \, f\Big(\big[n\bar x - \sum_{i=1}^{n-2} x_i + Q_1\big]/2\Big) \, \frac{2n^2 s}{Q_1} \, dx_{n-2} \cdots dx_2 \, dx_1$$

yields the joint distribution function $F(\bar x, s)$. The limits of integration are established by
employing the relationships $\sum_{i=1}^{n} x_i = n\bar x$ and $\sum_{i=1}^{n} x_i^2 = ns^2 + n\bar x^2$, together with mathematical
induction, to prove that $x_{n-r}$, $r = 2, 3, \cdots, n - 1$, is restricted to the closed interval

$$\Big( \big[n\bar x - \sum_{i=1}^{n-r-1} x_i - Q_r\big]/(r + 1), \; \big[n\bar x - \sum_{i=1}^{n-r-1} x_i + Q_r\big]/(r + 1) \Big),$$

where

$$Q_r = \Big[ -r(r + 2) \sum_{i=1}^{n-r-1} x_i^2 - 2r \sum_{i<j}^{n-r-1} x_i x_j + 2rn\bar x \sum_{i=1}^{n-r-1} x_i - rn(n - r - 1)\bar x^2 + (r + 1)rns^2 \Big]^{1/2}.$$







9. On Certain Distribution Problems in Multivariate Analysis. (Preliminary 
Report.) INGRAM OLKIN, University of North Carolina. 


This paper is concerned with the derivation of the joint distributions of (i) rectangular
coordinates, (ii) correlation coefficients, (iii) characteristic roots of a matrix, and (iv)
roots of a determinantal equation, starting in each case from the multivariate normal
distribution. Consider a set of $pn$ random variables following the distribution law $f(X, n) =
K \exp(-\tfrac{1}{2} \operatorname{tr} XX')$, $X$ a $p \times n$ matrix, and the real transformations: (1) $X = (T \; 0)e^{A}$,
where $T$ ($p \times p$) is a triangular matrix with $t_{ij} = 0$ ($i < j$), $A$ ($n \times n$) is a skew-symmetric
matrix; (2) $X = D_a(U \; 0)e^{B}$, $UU' = R$, where $D_a$ is a diagonal matrix with elements
$a_1, \cdots, a_p$, $U$ ($p \times p$) is a triangular matrix with $u_{ij} = 0$ ($i < j$), $\sum_j u_{ij}^2 = 1$, $i =
1, \cdots, p$, $B$ ($n \times n$) is a skew-symmetric matrix, $R$ ($p \times p$) is a symmetric matrix; (3)
$X = e^{C}(D_\mu \; 0)e^{D}$, where $C$ ($p \times p$) and $D$ ($n \times n$) are skew-symmetric, $D_\mu$ is a diagonal matrix
with elements $\mu_1, \cdots, \mu_p$, where $\mu_i^2$ are the characteristic roots of $XX'$. Using these
transformations on $f(X, n)$, (i), (ii), and (iii) are obtained. From the distribution law

$$f(X_1, n_1) \, f(X_2, n_2)$$

and the transformation (4) $X_1 = Y(D_s \; 0)e^{E}$, $X_2 = Y(D_c \; 0)e^{F}$, where $Y$ ($p \times p$), $E$ ($n_1 \times n_1$)
and $F$ ($n_2 \times n_2$) are skew-symmetric matrices, $D_s$ and $D_c$ are diagonal matrices with elements
$s_1, \cdots, s_p$ and $c_1, \cdots, c_p$ respectively, such that $s_i^2 + c_i^2 = 1$, the joint distribution of $\theta_i
= s_i^2$ is found, where $\theta_i$ are the roots of $|X_1X_1' - \theta(X_1X_1' + X_2X_2')| = 0$.


10. A Unified Approach to a Wide Class of Distribution Problems in Multivariate
Analysis. S. N. Roy, University of North Carolina.


(1) $X$ being a $p \times n$ matrix of random observations (reduced to means) from a $p$-variate
normal population, and it being known that there exist a $p \times p$ triangular matrix $T$, and
a $p \times n$ matrix $L$ (both ordinarily uniquely determined) such that $X = TL$ and $LL' = I$,
it is of interest to obtain the sampling distribution of $T$ from which various distributions,
including those of partial and multiple correlations, would easily follow. (2) $X_1$ and $X_2$
being $p \times n_1$ and $p \times n_2$ matrices of random observations (reduced to means) from two
$p$-variate normal populations and it being known that there exist (ordinarily uniquely) a
$p \times p$ matrix $Z$, a $p \times n_1$ matrix $L_1$, a $p \times n_2$ matrix $L_2$ (with the constraints $L_1L_1' =
L_2L_2' = I$), and $p \times p$ diagonal matrices $D_s$ and $D_c$ (where $s_i = \sin \theta_i$; $c_i = \cos \theta_i$; $i =
1, 2, \cdots, p$) such that $X_1 = ZD_sL_1$, $X_2 = ZD_cL_2$, it is of interest in multivariate analysis
to obtain the sampling distribution of $\theta$ ($= \theta_1, \cdots, \theta_p$). (3) With the same $X$ matrix as in
(1), and it being known that there exist (ordinarily uniquely) a $p \times p$ orthogonal matrix
$\Gamma$, a $p \times n$ matrix $M$ (such that $MM' = I$), and a diagonal matrix $D_t$ ($p \times p$) such that
$X = \Gamma D_t M$, it is of interest to obtain the sampling distribution of $t$ ($= t_1, t_2, \cdots, t_p$).
With the help of the constraints indicated one could knock out any $p(p + 1)/2$ out of $L$ in
(1), out of each of $L_1$ and $L_2$ in (2), and of each of $\Gamma$ and $M$ in (3); denote the remaining
elements respectively by $L_R$, $(L_{1R}, L_{2R})$, and $(\Gamma_R, M_R)$. Then in (1), (2), and (3), respectively,
we change over from $X$ to $(T, L_R)$, from $(X_1, X_2)$ to $(Z, \theta, L_{1R}, L_{2R})$, and from $X$ to
$(\Gamma_R, t, M_R)$. This is made easy by an artifice discussed in the paper, and the way $L$ in (1),
$(L_1, L_2)$ in (2), and $M$ in (3) occur, makes it easy to integrate out over them leaving us
with the distributions of $T$ in (1), ($Z$ and $\theta$) in (2), and ($\Gamma$ and $t$) in (3). From this the null
distributions of $T$ in (1), $\theta$ in (2), and $t$ in (3) follow with great ease. Certain nonnull distributions
would also come out without much difficulty.


11. An Extension of the Buffon Needle Problem. Nathan Mantel, National
Cancer Institute.


Historically, the Buffon needle problem is concerned with the estimation of the value of
$\pi$ from the probability of intersection of a straight line of fixed length (< 1) with a series







of equally unit-spaced parallel lines, on which the straight line is allowed to fall at random.
The present paper extends the problem to the estimation of $\pi$ from the average number of
intersections of a straight line of any fixed length with a series of equally spaced parallel
and perpendicular lines on which the straight line is allowed to fall at random. It is also
shown that, comparatively, very precise estimates of $\pi$ can be made, for long straight lines,
from the variation in number of intersections rather than from the average number of
intersections. From purely statistical considerations it is demonstrated that $\pi$ must lie
between 3.1231 and 3.1752, with no necessity for any measurements being made.
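
For orientation, the classical form of the problem is easy to simulate. The sketch below covers only the historical case of a needle of length at most one on unit-spaced parallel lines, where the crossing probability is $2l/\pi$; it is not the extension described in the abstract, the function name and seed are assumptions, and the simulation itself draws angles using the library value of $\pi$, so it illustrates the estimator rather than computing $\pi$ from scratch.

```python
import math
import random


def estimate_pi_buffon(needle_length, n_throws, seed=0):
    """Estimate pi from the fraction of throws in which the needle crosses a line."""
    if not 0 < needle_length <= 1:
        raise ValueError("needle length must lie in (0, 1] for unit spacing")
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_throws):
        centre = rng.uniform(0.0, 0.5)          # distance from centre to nearest line
        angle = rng.uniform(0.0, math.pi / 2)   # acute angle between needle and lines
        if centre <= 0.5 * needle_length * math.sin(angle):
            hits += 1
    # P(cross) = 2 l / pi, so pi is estimated by 2 l n / hits.
    return 2.0 * needle_length * n_throws / hits


pi_estimate = estimate_pi_buffon(needle_length=1.0, n_throws=200_000)
```

With 200,000 throws the standard error of the estimate is roughly 0.005, which gives a sense of how slowly the classical estimator converges.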


12. A Generalization of Sampling without Replacement from a Finite Universe.
D. G. Horvitz and D. J. Thompson, Iowa State College.


Let the finite universe consist of $N$ elements $U_i$ ($i = 1, 2, \cdots, N$). A sample of $n$ elements
is to be drawn without replacement and the total $T$ of some character $X$ of the elements
estimated from the sample. Denote by $P(U_i)$ the probability that the $i$th element
will be included in a sample of size $n$. An unbiased estimator $\hat T = \sum_{i=1}^{n} x_i/P(U_i)$ is proposed,
and expressions for the variance of this estimator as well as an unbiased estimator of this
variance are given. An extension to a two-stage sampling scheme is presented. Consideration
is given to methods of determining selection probabilities which will result in optimum
probabilities $P(U_i)$ on the basis of the prior information available on the elements of the
universe, and two approximate methods are illustrated.
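
The unbiasedness of the proposed estimator can be verified exhaustively on a toy universe. In the sketch below the universe values, the unequal sample probabilities, and the function names are all hypothetical; the inclusion probability of each element is obtained by summing the probabilities of the samples that contain it.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_probabilities(design, n_elements):
    """P(U_i): total probability of the samples that contain element i."""
    return [sum(p for sample, p in design.items() if i in sample) for i in range(n_elements)]


def estimate_total(sample, values, inclusion):
    """Estimator of the total: sum of x_i / P(U_i) over the sampled elements."""
    return sum(Fraction(values[i]) / inclusion[i] for i in sample)


# Hypothetical universe of N = 4 elements and samples of n = 2 without replacement,
# the six possible samples receiving unequal probabilities 1/21, 2/21, ..., 6/21.
values = [3, 5, 8, 10]
samples = list(combinations(range(4), 2))
design = {sample: Fraction(weight, 21) for sample, weight in zip(samples, range(1, 7))}

inclusion = inclusion_probabilities(design, len(values))
expected_estimate = sum(p * estimate_total(sample, values, inclusion)
                        for sample, p in design.items())
```

The expectation over the design equals the true total exactly, and the inclusion probabilities add up to the sample size $n = 2$.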


13. A Problem in Two-Stage Sampling. B. M. Seelbinder, University of North
Carolina.


Charles Stein has suggested a two-stage sampling plan, the size of the second part of
the sample depending on the information supplied about the variance of the population
by the first part of the sample. In his work, the size $n_1$ of the first part of the sample is left
to the discretion of the experimenter. This study is designed to throw further light on the
choice of the value of $n_1$. For this purpose the expected value of the total sample size $n$
for given $n_1$ has been computed for four different significance levels $\alpha = .1, .05, .02, .01$
and varying $c = d/\sigma$, where $d$ is the allowable discrepancy. These values are presented in
four tables where $c$ ranges from .01 to 1.0 and $n_1$ ranges from 5 to 72,000. It is shown that
the computation can be made to depend on the knowledge of Pearson’s incomplete Gamma
function. An approximation whereby the computation can be made to depend only on the
knowledge of the normal distribution function has also been developed. Numerical evidence
for the adequacy of the approximation for moderately large values of $n_1$ ($n_1 \ge 61$) has been
adduced. Limiting values for the expected value of the total sample size are given for
fixed $n_1$ and $\alpha$ with varying $c$. The discussion of the use of the tables covers the different
sampling situations which may arise: (i) an approximate estimate of $\sigma$ is available, (ii)
only a rough estimate of $\sigma$ is available. Reasons are given which point to 250 as the upper
limit for $n_1$ in a two-stage sampling plan.


14. Bounds on a Distribution Which Are a Function of Moments to Order Four.
(Preliminary Report.) Marvin Zelen, University of North Carolina.


Let $F(y)$ be a cumulative distribution function defined for the random variable $a < y < b$,
and $x$ be a known quantity. Markov and Stieltjes considered the problem of finding the
$\inf_{F(y)} F(x)$ and the $\sup_{F(y)} F(x)$ as a function of a finite number of moments of the distribution.
This present paper investigates the explicit expressions for these bounds if the
moments to order four are known (1) in the case when the random variable has finite range,
(2) in the case when the random variable has infinite range. In the applications of these
bounds, it is necessary to order roots of certain orthogonal polynomials. It is suggested
that for ready application, a nomograph be used. These bounds would be useful when one







is confronted with a cumulative distribution function which is unknown or difficult to 
handle. 


15. An Inconsistency among Type A Regions. Herman Chernoff, University
of Illinois.


In a test of a hypothesis one may regard a sample in the critical region as evidence that
the hypothesis is false. Let us assume that for some reason it is desired to increase the
critical size of the test, i.e., to make rejection of the hypothesis more probable. Then one
may expect that an observation which led to rejection in the first test should still lead to
rejection in the new test. In other words, one should expect $W_\alpha \supset W_{\alpha'}$ if $\alpha > \alpha'$ where $W_\alpha$
is the critical region of size $\alpha$. An example is given where regions of Type A fail to have
this property.


16. Stochastic Approximation. (Preliminary Report.) Herbert Robbins and
S. Monro, University of North Carolina.


We consider the general problem of estimating a constant associated with a function
$M(x)$, e.g., the root of an equation $M(x) = \alpha$ or the abscissa of a maximum of $M(x)$. When
$M(x)$ is observable there are methods of determining the constant by “successive approximation.”
We suppose, on the contrary, that $M(x)$ is unknown but that to each value
of $x$ corresponds an observable random variable $Y = Y(x)$ with distribution function
$P[Y(x) \le y] = H(y \mid x)$ such that $M(x) = \int_{-\infty}^{\infty} y \, dH(y \mid x)$ is the expected value of $Y$ for the
given $x$. In the case where $M(x)$ is increasing and we wish to estimate the unique root,
$x = \theta$, of $M(x) = \alpha$, we propose to let $x_{n+1} = x_n + a_n(\alpha - y_n)$, where $x_1$ is an arbitrary constant,
$\{a_n\}$ is a sequence of positive constants, and $y_n$ is a random variable such
that $P[y_n \le y \mid x_n] = H(y \mid x_n)$. One of us has shown under certain conditions on $H(y \mid x)$
that, if $\{a_n\}$ is of the type $\{1/n\}$, then no matter what the initial value $x_1$,

$$\lim_{n \to \infty} E(x_n - \theta)^2 = 0,$$

so that $x_n$ is a consistent estimator of $\theta$. Work is in progress to establish in special cases
bounds on $E(x_n - \theta)^2$. By replacing $(\alpha - y_n)$ by $\operatorname{sgn}(\alpha - y_n)$, less severe restrictions need
be imposed on $H(y \mid x)$ in order to obtain the same result. This sequential type of design
can be applied to estimation in regression problems, to “all or nothing” response experiments
(where $y_n$ is limited to the values 0 and 1), and to the experimental determination of
the maximum of a function (cf. Harold Hotelling’s paper in Annals of Mathematical
Statistics, vol. 12 (1941), pp. 20-45). (This research was done in part under an Office of
Naval Research contract.)
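
The proposed recursion is short enough to try on an “all or nothing” response experiment. In the sketch below the logistic response curve, the target level 0.5 (whose root is 0), the gain sequence $c/n$ with $c = 4$, the seed, and the function names are all assumptions chosen for illustration; they are not taken from the abstract.

```python
import math
import random


def robbins_monro(observe, alpha, x1, n_steps, gain=1.0):
    """Iterate x_{n+1} = x_n + (gain / n) * (alpha - y_n), where y_n = observe(x_n)."""
    x = x1
    for n in range(1, n_steps + 1):
        y = observe(x)
        x = x + (gain / n) * (alpha - y)
    return x


rng = random.Random(0)


def response(x):
    """Hypothetical 0/1 response with P(y = 1 | x) = 1 / (1 + exp(-x)), increasing in x."""
    p = 1.0 / (1.0 + math.exp(-x))
    return 1 if rng.random() < p else 0


# The level alpha = 0.5 is reached at x = 0 for this response curve.
estimate = robbins_monro(response, alpha=0.5, x1=3.0, n_steps=20_000, gain=4.0)
```

Starting far from the root, the iterate drifts toward it and then settles, each step using only the single observed response at the current level.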


17. On the Properties and Statistical Purposes of Some Well-known and Some
New Tests in Multivariate Analysis. S. N. Roy, University of North
Carolina.


Consider two problems of multivariate analysis, each of which could be made to cover
a wide number of situations. (1) With two random samples of sizes $n_1$ and $n_2$ and dispersion
matrices $(a_{1ij})$ and $(a_{2ij})$ from two $p$-variate normal populations with dispersion matrices
$(\alpha_{1ij})$ and $(\alpha_{2ij})$, $i, j = 1, 2, \cdots, p$, an infinite number of similar region tests could be constructed
for the composite hypothesis $(\alpha_{1ij}) = (\alpha_{2ij})$, among which there is none having
the strong optimum properties of the usual $F$-test in the analogous univariate problem.
Among these similar region tests, however, there is a subset based on $F_i$, $i = 1, 2, \cdots, p$,







where $F_i$’s are the roots of the equation in $F$: $|a_{1ij} - F a_{2ij}| = 0$, such that the largest root has
moderate optimum properties with respect to one class of alternatives, the smallest for
another class and the product of the roots (which is the likelihood ratio test) for another
class of alternatives, all discussed in this paper. (2) With $k$ random samples of sizes $n_r$
from $k$ $p$-variate normal populations with means $m_{ri}$ and a common dispersion matrix
$(\alpha_{ij})$, $r = 1, 2, \cdots, k$; $i, j = 1, 2, \cdots, p$, an infinite number of similar region tests could
be constructed for the composite hypothesis $(m_{1i}) = (m_{2i}) = \cdots = (m_{ki})$, $i = 1, 2, \cdots, p$,
among which there is none having the strong optimum properties of the $F$-test in the analogous
univariate problem. Among these similar region tests, however, there is a subset based
on $F_i$, $i = 1, 2, \cdots, q \le p$, where $F_i$’s are the nontrivial roots of the equation in $F$:
$|b_{ij} - F a_{ij}| = 0$ (where $(b_{ij})$ is the matrix of the sample means reduced to the grand
means and $(a_{ij})$ is the pooled dispersion matrix from the different samples), such that
the largest and smallest roots have moderate optimum properties with respect to two
different classes of alternatives and the sum of the roots for a third class of alternatives,
all discussed in this paper. The likelihood ratio test, however, leads to the product. The
wide variety of situations each problem could be made to cover is also discussed in this
paper.




NEWS AND NOTICES 


Readers are invited to submit to the Secretary of the Institute news items of interest. 


Personal Items 


Dr. K. S. Banerjee, Statistician at the Central Sugarcane Research Station, 
Bihar, India, received his doctorate degree from the Calcutta University in 
January of this year. His thesis covered his contributions to “weighing designs.”

Mr. Lyle D. Calvin, formerly at the Institute of Statistics, North Carolina 
State College, has accepted the position of Biometrician with the Division of 
Biological Research, G. D. Searle & Co., Chicago, Illinois. 

Dr. Robert J. Hader has accepted a position on the staff of the Institute of 
Statistics, North Carolina State College. He leaves Los Alamos, New Mexico, 
where he has been employed as statistician for the Los Alamos Scientific Labora-
tory for the past two years.

Mr. Bernard Hecht has joined the Victor Division of RCA, Camden, New 
Jersey, as Manager, Assembly Quality Control, after five years as Quality Con- 
trol Manager of the International Resistance Company of Philadelphia, Penn- 
sylvania. 

Dr. Edward L. Kaplan has received his doctorate degree in mathematics 
from Princeton University and is now a member of the Technical Staff, Bell 
Telephone Laboratories, Murray Hill, New Jersey. 

Dr. Eugene Lukacs has joined the staff of the Statistical Engineering Labora- 
tory of the National Bureau of Standards. At the Bureau he will be engaged 
in research in mathematical statistics, particularly autoregressive series and 
stochastic processes. 

Mr. A. W. Marshall, formerly at the Washington, D. C., office of the Rand 
Corporation, has now moved to its Santa Monica, California, office. 







Dr. J. E. Morton is on leave of absence from Cornell University and is serving 
as Chief, Statistical Research and Development Staff, Office of the Housing 
Administrator, Washington, D. C. 

Dr. J. Ernest Wilkins, Jr., formerly on the staff at the American Optical 
Company, is now with Nuclear Development Associates, Inc., 80 Grand Street, 
White Plains, New York. 

Dr. William J. Youden of the National Bureau of Standards has recently 
been elected to Fellowship in the New York Academy of Sciences. 




Sigeiti Moriguti, Assistant Professor of Applied Mathematics at the University 
of Tokyo, is spending the academic year 1950-51 in research and study of 
mathematical statistics at the University of North Carolina under the sponsor- 
ship of the United States Army. He is the author of numerous research articles 
and a book on the theory of statistics. 


Statistics at Chicago 


The University of Chicago in 1949 established a Committee on Statistics 
which is in all respects equivalent to a department, having its own faculty, 
budget, and curriculum. Its purposes are research, instruction, and consulta- 
tion. Its faculty includes R. R. Bahadur, Milton Friedman, Leo A. Goodman, 
John Gurland, Tjalling C. Koopmans, William H. Kruskal, Harry V. Roberts, 
Murray Rosenblatt, Leonard J. Savage, Charles M. Stein, and W. Allen Wallis 
(Chairman). Among other statisticians at the University of Chicago are Walter
Bartky (Physical Sciences), Donald W. Fiske (Psychology), Philip M. Hauser
(Sociology), Paul R. Halmos (Mathematics), Karl J. Holzinger (Education), 
H. Gregg Lewis (Economics), Jacob Marschak (Cowles Commission for Re- 
search in Econometrics), William Stephenson (Psychology), Louis L. Thurstone 
(Psychology), Josephine Williams (National Opinion Research Center), and 
Sewall Wright (Zoology). 

The following courses are offered by the Committee: Introduction to Statis- 
tics; Statistical Inference (3 quarters); Introduction to Mathematical Probabil- 
ity; Introduction to Mathematical Statistics (2 quarters); Sample Surveys; 
Analysis of Variance and Regression; Estimation and Tests of Hypotheses; 
Statistical Theory of Decision-making; Theory of Minimum Risk; Sequential 
Analysis; Non-parametric Inference; Multivariate Analysis; Design of Experi- 
ments; Time Series; Statistical Problems of Model Construction; Limit The- 
orems of Probability Theory; Markov Processes; Mathematical Techniques of 
Statistics; and several seminars. In addition, a number of statistics courses are 
offered in other departments, e.g., Factor Analysis, Econometrics, Quality 
Control, Index Numbers, Biometrics, etc. 

Three kinds of degree may be obtained at Chicago in Statistics. (1) The M.A. 
or Ph.D. in a substantive field, with concentration in Statistics, is not administered 
by the Committee, but it cooperates fully with the substantive departments in 
these degree programs. (2) The M.A. in Statistics is awarded on the basis of
(i) a thesis, (ii) written examinations, and (iii) work in a minor field. (3) The
Ph.D. in Statistics is awarded on the basis of (i) preliminary written examinations, 
(ii) work in a minor field, (iii) participation in statistical consultation, (iv) a 
dissertation, (v) a public lecture on the content of the dissertation, and (vi) a 
final oral examination. 


Summer Sessions in Berkeley, California 


This year’s summer program at the Statistical Laboratory of the University 
of California, Berkeley, California, consists of two sessions, June 18-July 28
and July 30-September 8. The program includes four of the usual undergraduate 
courses, two in each session, and two graduate courses. One of the latter is a 
regular course of lectures on rank correlation methods and on time series analy- 
sis. The other graduate course is a seminar on time series and related problems. 
Both graduate courses will be given during the first summer session by Professor 
Maurice G. Kendall of the London School of Economics and Political Science. 
Professor J. Neyman will be available for consultations on work leading to 
higher degrees. In addition to the above two persons, the faculty of the summer 
sessions will include Dr. Grace E. Bates (Mount Holyoke College), Dr. Colin 
R. Blyth (University of Illinois), and Dr. Gottfried E. Noether (New York 
University). 

Summer Seminar in Statistics 


The second annual session of the Summer Seminar in Statistics will be held 
at the University of Connecticut, Storrs, Connecticut, August 6-31, 1951. The 
purpose of the Seminar is to stimulate general exchange of ideas by providing 
informal contacts and free discussions among academic statisticians, students, 
and users of statistical techniques. The principal session meets daily from 3:00 
p.m. to supper. The schedule of topics, together with the organizers of each 
week’s program, is as follows: 

August 6-10. Statistics in the Biological Sciences (C. I. Bliss and J. Ipsen);

August 13-17. Time Series (M. G. Kendall and J. W. Tukey);

August 20-24. Statistical Theory and Probability (M. Kac and H. Robbins);

August 27-31. Statistical Techniques with Special Reference to the Social
Sciences (F. C. Mosteller, F. L. Strodtbeck, and M. A. Woodbury).

Frequent statistical clinics to discuss the solution of particular practical prob- 
lems are planned. 

Dormitory accommodations of single and double rooms are available at the 
University of Connecticut. Family groups of three or more must use other 
housing. It is hoped that a number of stipends to cover living expenses will be 
available on a competition basis to graduate students. Further information 
about the Seminar may be obtained from the Secretary of the Seminar: Pro- 
fessor D. F. Votaw, Jr., Department of Mathematics, Yale University, New 
Haven, Connecticut. 







Doctoral Dissertations in Statistics, 1950 


Listed below are the doctorates conferred during the year 1950 in the United 
States and Canada for which the dissertations were written on topics in statistics. 
The university, month in which degree was conferred, major subject, minor sub- 
ject, and the title of the dissertation are given in each case if available. If any 
doctorate properly belonging in this list is omitted, the Editor would like the 
relevant information concerning such doctorate. It is planned to publish a list 
of doctorates in the June issue each year. 

R. R. Bahadur, North Carolina, June, major in mathematical statistics, minor 
in experimental statistics and mathematics, “On a Class of Decision Problems 
in the Theory of R Populations.” 

C. R. Blyth, California, June, major in mathematics, “I. Contribution to the 
Statistical Theory of the Geiger-Müller Counter. II. On Minimax Statistical
Decision Procedures and Their Admissibility.” 

K. A. Bush, North Carolina, August, major in mathematical statistics, minor 
in mathematics and economics, “Orthogonal Arrays.”

A. L. Finkner, North Carolina, major in experimental statistics, minor in
agronomy, “Further Investigation on the Theory and Application of Sampling
for Scarcity Items.”

W. D. Foster, North Carolina, major in experimental statistics, minor in
meteorology, “On the Selection of Predictors: Two Approaches.”

M. Halperin, North Carolina, August, major in mathematical statistics, minor
in experimental statistics and mathematics, “Estimation in Truncated Sampling
Processes.”

H. M. Hughes, California, September, major in mathematics, “Estimation of
the Variance of the Bivariate Normal Distribution.”

P. E. Irick, Purdue, February, major in mathematics, minor in psychology, 
“A Geometric Study of the Exact Sampling Distribution of Standard Deviations
When the Sampled Population Is Arbitrary.” 

S. L. Isaacson, Columbia, June, major in mathematical statistics, minor in 
mathematics, “On the Theory of Unbiased Tests of Simple Statistical Hypotheses 
Specifying the Values of Two or More Parameters.” 

E. H. Jebe, North Carolina, major in experimental statistics, minor in agri-
cultural economics, “The Theory and Application of the Selection of Primary
Units for Sampling an Agricultural Population.”

A. W. Kimball, Jr., North Carolina, major in experimental statistics, minor
in mathematics, “Studies in the Statistical Design and Analysis of Microbiolog-
ical Assays of Amino Acids.”

G. E. McCreary, Iowa State College, June, major in statistics, minor in mathe-
matics and economics, “Cost Functions for Sample Surveys.”

L. E. Moses, Stanford, major in statistics, minor in mathematics, “An Iter-
ative Construction of the Optimum Sequential Procedure When the Cost Func-
tion Is Linear.”

S. W. Nash, California, June, major in mathematics, “I. Contribution to the
Theory of Experiments with Many Treatments. II. On the Law of the Iterated
Logarithm for Dependent Random Variables.”

R. P. Peterson, California (Los Angeles), June, major in mathematics, “Certain 
Optimum Statistical Decision Methods.” 

M. Pizzi (Doctor of Public Health), Johns Hopkins, “An Approximate Solu- 
tion for the Standard Error of LD50 as Obtained by the Reed-Muench Method.” 

B. Sherman, Princeton, June, major in mathematics, “A Random Variable 
Related to the Spacing of Sample Values.” 

S. S. Shrikhande, North Carolina, August, major in mathematical statistics, 
minor in experimental statistics, “Construction of Partially Balanced Designs
and Related Problems.”

H. Solomon, Stanford, major in statistics, “Distribution of the Measure of a
Two-Dimensional Random Set.”

H. E. Teicher, Columbia, June, major in mathematical statistics, minor in
mathematics, “On the Factorization of Distributions.”

W. A. Vezeau, St. Louis, June, major in mathematics, “On the Product Distri-
bution of Normally Distributed Variables.”

S. A. Vora, North Carolina, June, major in mathematical statistics, minor in
experimental statistics, “Bounds on the Distribution of Chi-Square.”

J. T. Wakeley, North Carolina, major in experimental statistics, minor in 
meteorology, “On Linear Regression Method as Related to Long Time Experi- 
ments in Agricultural Climatology.” 




New Members 


The following persons have been elected to membership in the Institute. 
(December 1, 1950 to February 28, 1951) 


Benktander, Gunnar, Fil. Kand. (Univ. of Stockholm), Actuary, Post Fack, Stockholm 
26, Sweden. 

Boll, C. H., B.S. (Stanford Univ.), Graduate student in Statistics, Stanford University, 
1247 Cowper, Palo Alto, California. 

Carey, T. M., Ph.D. (Univ. of London), Lecturer in Mathematics, University College, 
Cork, Ireland, “Duinin,” Laburnum Park, Model Farm Road, Cork, Ireland.

DeLancie, R. H., A.B. (Univ. of Calif.), Graduate student in Statistics, University of 
California, 1137 Colusa Avenue, Berkeley 7, California. 

Dighero, Oscar Alfonso Martinez, Graduate (Univ. of San Carlos, Guatemala), Civil 
Engineer, Chief, Division of Engineering and Architecture, Instituto Guatemalteco
de Seguridad Social, 11 Avenida Norte No. 44, Guatemala, Guatemala, Central America. 

Ellery, J. B., M.A. (Univ. of Colo.), Graduate student and Teaching Assistant, Department 
of Speech, University of Wisconsin, 7 Tilton Terrace, Madison 4, Wisconsin. 

Esary, J. D., A.B. (Whitman College), Teaching Assistant, Statistical Laboratory, Uni- 
versity of California, 2534 Dwight Way, Berkeley 4, California. 

Esscher, Fredrik, Ph.D. (Univ. of Lund), Chief Actuary, Skandia Insurance Co., Stock- 
holm 2, Sweden. 

Grometstein, A. A., M.A. (Columbia Univ.), Industrial Statistician and Consulting Physi- 
cist, Sylvania Electric Products, Inc., 70 Forsyth Street, Boston, Massachusetts. 







Gross, F. A., B.S. (American Univ.), Statistician, Research Division, Bureau of Naval 
Personnel, Arlington, Virginia, 3409B New Mexico Ave., N. W., Washington 16, D. C.

Houthakker, H. S., Ph.D. (Univ. of Amsterdam), Research Officer, University of Cam- 
bridge, Department of Applied Economics, 8 Richmond Road, Cambridge, England. 

Hubbell, C. H., A.B. (Oberlin College), Box 467, Benjamin Franklin Station, Washington
4, D. C.

Kurtz, T. E., A.B. (Knox College), Research Assistant, Mathematics Department, Fine 
Hall, Box 708, Princeton University, Princeton, New Jersey. 

Lanteli, Gunnar, Fil. Kand. (Univ. of Lund), Actuary of Försäkringsaktiebolaget Hansa,
Stockholm 7, Sweden. 

Matern, Bertil, Fil. Lic. (Univ. of Stockholm), Assistant Professor, Swedish Forest Re- 
search Institute, Lappkarrsvagen 47, Stockholm 50, Sweden. 

McCall, Jr., C. H., A.B. (George Washington Univ.), Assistant in Statistics, 6701-44th 
Street, Chevy Chase 15, Maryland. 

McHugh, R. B., M.A. (Univ. of Minn.), Assistant Professor of Psychology and Statistics, 
Iowa State College, 222 Stanton, Ames, Iowa. 

Meier, Paul, M.A. (Princeton Univ.), Statistician, and Research Secretary, Philadelphia 
Tuberculosis and Health Association, 39 Vandeventer Avenue, Princeton, New Jersey. 

Owen, D. B., M.S. (Univ. of Wash.), Research Associate in Mathematics, University of 
Washington, 612 West 85th Street, Seattle 7, Washington. 

Poch, F. A., Licenciado en Ciencias (Univ. of Madrid), Official, Instituto Nacional de 
Estadistica; Specialist of Section of Methodology; Assistant Professor, Mathematical 
Statistics, University of Madrid, Madrid, Spain. 

Rios, Sixto, Ph.D., Professor of Mathematical Statistics, University of Madrid; Chief, 
Department of Statistics of the Superior Council of Scientific Research, Madrid, Spain. 

Rosenbaum, S. Z., B.S. (Univ. of Chicago), Research Director, Community Welfare Council 
of Milwaukee County, 2965 N. 81st Street, Milwaukee 10, Wisconsin.

Saxen, Tryggwe, B.Sc. (Univ. of Helsingfors), Fil. Kand., Assistant Actuary of Industrial 
Accident Insurance Co., Kasarngatan 44V, Helsingfors, Finland.

Shapiro, Arthur, A.B. (Brooklyn College), 66334 Telegraph Avenue, Oakland 9, California. 

Shimizy, Kunio, B.A. (Univ. of British Columbia), Statistician, Institutions Section,
Dominion Bureau of Statistics, Ottawa, Ontario, Canada. 

Soda, Takemune, M.P.H. (Johns Hopkins Univ.), Chief, Division of Health and Welfare
Statistics, Ministry of Welfare, Tokyo; Chief, Department of Epidemiology, Institute 
of Public Health, Tokyo, 808 Den’en-Chofu 2-Chome, Ohta-ku, Tokyo, Japan. 

Strodtbeck, F. L., Ph.D. (Harvard Univ.), Assistant Professor of Sociology, Yale Uni- 
versity, 576 Whalley Avenue, New Haven 11, Connecticut. 

Taranger, Aksel, B.Sc. (Univ. of Wis.), General Sales Manager, Norsk Aluminium Com-
pany, Loekkeveien 9, Oslo, Norway.

van Dantzig, D., Ph.D. (Groningen, Netherlands), Head of Department of Statistics, 
Mathematical Centre, Amsterdam; Professor, University of Amsterdam, Valeriusstraat 
58, Amsterdam-Zuid, Netherlands. 

Varnum, E. C., M.S. (Univ. of Mich.), Mathematician, Barber-Colman Company, Rock- 
ford, Illinois. 




REPORT OF THE OAK RIDGE MEETING OF THE INSTITUTE 


The forty-sixth meeting of the Institute of Mathematical Statistics was held 
jointly with the Biometric Society (Eastern North American Region) at the
Oak Terrace in Oak Ridge, Tennessee, on March 15-17, 1951. The registration
was one hundred four, including the following members of the Institute: 


G. E. Albert, Kenneth J. Arnold, Joseph Berkson, R. C. Bose, R. A. Bradley, A. E. 
Brandt, Irwin J. Bross, Lyle D. Calvin, Osmer Carpenter, Jerome Cornfield, Phelps P. 
Crump, Edward E. Cureton, D. B. Duncan, Arthur M. Dutton, Meyer Dwass, Churchill 
Eisenhart, Evelyn Fix, B. G. Greenberg, Samuel W. Greenhouse, Robert J. Hader, Boyd
Harshbarger, Alston S. Householder, Oscar Kempthorne, Allyn W. Kimball, Jacob E. 
Lieberman, Joseph Mandelson, Nathan Mantel, Margaret P. Martin, Herbert A. Meyer, 
Jack Moshman, T. Ellison Neal, H. W. Norton, I. Olkin, D. A. Probst, John H. Reynolds, 
S. N. Roy, Marvin Schneiderman, Esther Seiden, H. Fairfield Smith, Melvin D. Springer, 
William F. Taylor, D. Teichroew, M. E. Terry, Donovan J. Thompson, David V. Tiedeman, 
John W. Tukey, Myron J. Willis, W. J. Youden. 


Professor Paul Densen, University of Pittsburgh, presided at the first Insti-
tute session, Thursday afternoon, March 15, devoted to Public Health Statistics.
Papers read were:


1. A Simple Stochastic Model of Recovery, Relapse, Death and Loss of Patients. Evelyn 
Fix and Jerzy Neyman, University of California, Berkeley. 

2. An Elementary Stochastic Process for a Syphilis Population. B. G. Greenberg, Uni- 
versity of North Carolina. 


The Oak Ridge National Laboratory acted as host at a smoker and beer party 
in honor of the visitors at the Ridge Recreation Hall on Thursday evening. 
About seventy-five people were present. 

On Friday morning Dr. W. A. Arnold, Oak Ridge National Laboratory,
acted as chairman at a session on Bioassay with Quantal Responses. Papers
presented were:


1. Why I Prefer Logits to Probits and Sinits. Joseph Berkson, Mayo Clinic. 
2. How Much Does the Choice of Metameter Matter? J. W. Tukey, Princeton University.
3. Extensions of Elementary Methods in Bioassay. Irwin Bross, The Johns Hopkins 
University. 
Jerome Cornfield, National Institute of Health, initiated a lively discussion. 
Professor C. L. Comar, University of Tennessee, was chairman of the meet- 
ing Friday afternoon concerning Experimental Design. The following papers 
were presented: 


1. Incomplete Block Designs. R. C. Bose, University of North Carolina. 

2. Fractional Replication. Oscar Kempthorne, Iowa State College. 

3. The Analysis of Long Term Experiments. A. M. Dutton, Iowa State College. 

4. Testing-for-Preference Experiments. L. D. Calvin, G. D. Searle and Company, Chicago. 


About seventy people were present at a banquet sponsored by both organiza- 
tions. Dr. E. R. McCrady, U. S. Atomic Energy Commission, acted as toast-
master and introduced Dr. A. M. Weinberg, Research Director of the Oak 
Ridge National Laboratory. Dr. Weinberg welcomed the visitors and spoke 
briefly of the analogy between nuclear physics and biometrics (in the broad 
sense) and stochastic processes in nuclear research. 







The first session Saturday morning on Multivariate Analysis was presided 
over by Professor E. E. Cureton, University of Tennessee. Papers presented 
were: 


1. On the Properties and Statistical Purposes of Some Well-known and Some New Tests
in Multivariate Analysis. S. N. Roy, University of North Carolina.

2. Some Applications of Compound Symmetry Tests. A. W. Kimball, Oak Ridge National
Laboratory.

3. Some Preliminary Results of Multivariate Discriminant Analysis. D. V. Tiedeman,
Harvard University.


There followed two concurrent sessions of contributed papers. Dr. A. S. House- 
holder, Oak Ridge National Laboratory, presided at the first session, at which 
the following papers were contributed: 


1. Confidence Intervals for the Mean Rate at Which Radioactive Particles Impinge on a
Type I Counter. Preliminary Report. G. E. Albert, University of Tennessee and Oak
Ridge National Laboratory.

2. A Problem of Elapsed Times in a Sequence of Events. Osmer Carpenter, Carbide and
Carbon Chemicals Division, Oak Ridge.

3. On the Existence of Unbiased Tests for Testing Composite Hypotheses. Esther Seiden,
University of Buffalo.

4. Group Divisible Incomplete Block Designs. R. C. Bose, University of North Carolina.

5. Orthogonal Arrays of Strength Two and Three. R. C. Bose and Kenneth A. Bush, Uni-
versity of North Carolina.

6. The Structure of Balanced Incomplete Block Designs, and the Impossibility of Certain
Unsymmetrical Cases. (By Title.) William S. Connor, University of North Carolina.

7. Some Bounded Significance Level Tests for the Median. (By Title.) John E. Walsh,
The Rand Corporation.


Professor W. S. Snyder, University of Tennessee, presided at the second session
Saturday morning, at which the following contributed papers were presented: 


1. Joint Sampling Distribution of the Mean and Standard Deviation for Distribution
Functions of the First Kind. Melvin D. Springer, U. S. Naval Ordnance, Indianapolis.

2. On Certain Distribution Problems in Multivariate Analysis. Preliminary Report.
Ingram Olkin, University of North Carolina.

3. A Unified Approach to a Wide Class of Distribution Problems in Multivariate Analysis.
S. N. Roy, University of North Carolina.

4. An Extension of the Buffon Needle Problem. Nathan Mantel, National Cancer In-
stitute.

5. A Generalization of Sampling without Replacement from a Finite Universe. D. G. Hor-
vitz and D. J. Thompson, Iowa State College.

6. A Problem in Two-Stage Sampling. (By Title.) B. M. Seelbinder, University of North
Carolina.

7. Bounds on a Distribution Which Are a Function of Moments to Order Four. Preliminary
Report. (By Title.) Marvin Zelen, University of North Carolina.

8. An Inconsistency among Type A Regions. (By Title.) Herman Chernoff, University of
Illinois.

9. Stochastic Approximation. Preliminary Report. (By Title.) Herbert Robbins and
S. Monro, University of North Carolina.







Additional special events of the meeting included a tour through the American
Museum of Atomic Energy and a tea for the ladies given by Mrs. A. W. Kimball.

JACK MOSHMAN
Assistant Secretary




PUBLICATIONS RECEIVED 


BUTTERBAUGH, GRANT I., A Bibliography of Statistical Quality Control, Supplement, Uni-
versity of Washington Press, Seattle, 1951, 141 pp., $2.00.
Tables d’Intéréts et d’Annuités, Crédit Communal de Belgique, Brussels, 1950, 163 pp. 










BIOMETRIKA 
A Journal for the Statistical Study of Biological Problems 


Volume 38 Contents Parts 1 and 2, June 1951 


1. Major Greenwood (with portrait). By P. L. McKINLAY. 2. Partial and multiple rank correlation. 
By P. A. P. MORAN. 3. Some questions of distribution in the theory of rank correlation. By S.T. DAVID, 
M. G. KENDALL, and A. STUART. 4. On distributions for which the Hartley-Khamis solution of the 
moment-problem is exact. By H. P. MULHOLLAND. 5. The effect of non-normality on the power func- 
tion of the F-test in the analysis of variance. By F. N. DAVID and N. L. JOHNSON. 6. Regression, 
structure and functional relationship. By M. G. KENDALL. 7. An application of the distribution of 
ranking concordance coefficient. By A. STUART. 8. Some tests for randomness in plant populations. 
By MARJORIE THOMAS. 9. The geometry of estimation. By J. DURBIN and M. G. KENDALL. 
10. The frequency distribution of the product-moment correlation coefficient in random samples of any size 
drawn from non-normal universes. By A. K. GAYEN. 11. Note on the exact treatment of contingency,
goodness of fit and other problems of significance. By G. H. FREEMAN and J. H. HALTON. 12. Effi-
ciency of the method of moments and the Gram-Charlier type A distribution. By L. R. SHENTON. 13.
Tables of the 5 and 0.5% points of Pearson curves (with argument $\beta_1$ and $\beta_2$) expressed in standard measure.
By E. S. PEARSON and M. MERRINGTON. 14. Random dispersal in theoretical populations. By J.
G. SKELLAM. 15. Estimation problems when a simple type of heterogeneity is present in the sample.
By W. M. LONG. 16. The Jacobians of certain matrix transformations useful in multivariate analysis (based
on P. L. Hsu’s lectures). By W. L. DEEMER and I. OLKIN. 17. Testing for serial correlation in least
squares regression, II. By J. DURBIN and G. S. WATSON. 18. Bi-variate k-statistics and cumulants
of their joint sampling distribution. By M. B. COOK. 19. Charts of the power function for analysis of
variance tests, derived from the non-central F-distribution. By E. S. PEARSON and H. O. HARTLEY.
20. A chart for the incomplete Beta-function and the cumulative binomial distribution. By H. O. HART-
LEY and E. R. FITCH. 21. MISCELLANEA. 22. REVIEWS.


The subscription price, payable in advance, is 45s. inland, 54s. export (per volume including postage). Cheques 
should be drawn to Biometrika and sent to “The Secretary, Biometrika Office, Department of Statistics,
University College, London, W.C. 1.” All foreign cheques must be in sterling and drawn on a bank
having a London agency. 





SKANDINAVISK 
AKTUARIETIDSKRIFT 


1950 - Parts 3 - 4 
Contents 


MARTIN WEIBULL. The Distribution of the t and z Variables in the Case of Stratified Sample
with Individuals Taken from Normal Parent Populations with Varying Means
A. HALD and S. A. SINKBÆK. A Table of Percentage Points of the $\chi^2$-Distribution
K.-G. HAGSTROEM. Risk Theory and Group Insurance
LARS DAHLGREN. A Theorem on Translations by Hille, and Its Interpretation from the Point
of View of the Theory of Probability
J. F. STEFFENSEN. More about Invalidity Functions
TORE DALENIUS. The Problem of Optimum Stratification
STEN MALMQUIST. On a Property of Order Statistics from a Rectangular Distribution
Annual subscription: 10 Swedish Crowns (Approx. $2.00). 
Inquiries and orders may be addressed to the Editor, 


SKARVIKSVAGEN 7, DJURSHOLM (SWEDEN) 





ROYAL STATISTICAL SOCIETY


SPECIAL REPRINTS 


SYMPOSIUM ON STOCHASTIC PROCESSES: 
Stochastic Processes and Statistical Physics, J. E. MOYAL
Some Evolutionary Stochastic Processes, M. S. BARTLETT
Stochastic Processes and Population Growth, D. G. KENDALL
(With Discussion on the papers)
Price, post free, 12s. 6d.
TABLES OF SEQUENTIAL INSPECTION SCHEMES TO
CONTROL FRACTION DEFECTIVE, F. J. ANSCOMBE
Price, post free, 2s. 6d.
These papers, published in the Journal of the Royal Statistical Society,
1949, have now been issued in reprint form. Copies may be obtained
direct from


The Royal Statistical Society, 
4, Portugal Street, 
London, W.C.2. 


MATHEMATICAL REVIEWS 


A journal containing reviews of the mathematical liter- 
ature of the world, with full subject and author indices 


Publication of this journal is sponsored by the American Mathe- 
matical Society, Mathematical Association of America, Institute of 
Mathematical Statistics, London Mathematical Society, Edinburgh
Mathematical Society, Union Matematica Argentina, and others.


Subscriptions accepted to cover the calendar year only. 
Issues appear monthly except July. $20.00 per year. 
Send subscription order or request for sample copy to 


AMERICAN MATHEMATICAL SOCIETY 
531 West 116th Street, New York City 27 





SANKHYA 
The Indian Journal of Statistics 
Edited by P. C. Mahalanobis 


Vol. X, Part 3, 1950 
Why Statistics? ... P. C. MAHALANOBIS
Statistical Inference Applied to Classificatory Problems ... C. RADHAKRISHNA RAO
A Note on the Distribution of $D^2_{p+q} - D^2_p$ and Some Computational Aspects of
$D^2$ Statistic and Discriminant Function ... C. RADHAKRISHNA RAO
Assumptions Underlying the Use of the Tetrachoric Correlation Coefficient ... SUNDRI VASWANI
Some Suggestions Regarding the 1951 Indian Census Questionnaire ... P. MUKHERJEE


Annual subscription: 30 rupees 
Inquiries and orders may be addressed to the 
Editor, Sankhyā, Presidency College, Calcutta, India.


ECONOMETRIC PAPERS 


A Supplement to Econometrica 


just published. 340 pages. $2.50 


Report of the Washington Meeting of the Econometric Society, Volume V 
of Proceedings of the International Statistical Conferences held in 
Washington, D. C., September 6-18, 1947 

Contributors: Allais, Amoroso, Anderson, Chait, C. Clark, Derksen, Di- 
visia, Domar, Dumontier, Friedman, Geary, Georgescu-Roegen, 
Henon, Hurwicz, Jantzen, Kendall, Koopmans, S. Kuznets, Lange,
Leontief, Lutfalla, Massé, Metzler, Perroux, Rocard, Roos, Roy,
Rueff, Shirras, Smithies, Stafford, Steinhaus, Stone, Tinbergen, 
Wold, and others 


THE ECONOMETRIC SOCIETY 
University of Chicago Chicago 37, Illinois 





JOURNAL OF THE March 1951 
AMERICAN STATISTICAL ASSOCIATION Vol. 46 No. 253 
1108 16th Street, N. W., Washington 6, D. C. 


Undergraduate Statistical Education ... S. S. WILKS
The Influence of Statistical Methods for Research Workers on the Development of the Science
of Statistics ... F. YATES
The Impact of R. A. Fisher on Statistics ... HAROLD HOTELLING
The Fisherian Revolution in Methods of Experimentation ... W. J. YOUDEN
R. A. Fisher’s Statistical Methods for Research Workers ... KENNETH MATHER
The Theory of Statistical Decision ... L. J. SAVAGE
The Kolmogorov-Smirnov Test for Goodness of Fit ... FRANK J. MASSEY, JR.
A Large Sample t-Statistic Which Is Insensitive to Non-Randomness ... JOHN E. WALSH
A Short-Cut Measure of Correlation ... WILLIAM A. SPURR
On Stratification and Optimum Allocations ... W. DUANE EVANS
Sampling with Probabilities Proportional to Size ... NATHAN KEYFITZ
A Source of Bias in One of the Samples of the 1950 Census ... PETER O. STEINER
The Distribution of Blocks in an Uncongested Stream of Automobile Traffic ... MORTON S. RAFF
Willard Phillips, a Predecessor of Paasche in Index Number Formulation ... ROY W. JASTRAM


The American Statistical Association invites as members all per-
sons interested in:
1. development of new theory and method 
2. improvement of basic statistical data 
3. application of statistical methods to practical problems. 












